bluenote-1577 / sylph

ultrafast taxonomic profiling and genome querying for metagenomic samples by abundance-corrected minhash.
MIT License

Can it replace kraken2 for pathogen identification? #1

Closed: yi1873 closed this issue 10 months ago

yi1873 commented 11 months ago

I tested sylph for pathogen identification on a FASTQ file with 20M 50-bp reads, but I didn't get any results. Is this software only suitable for applications with large data volumes, such as gut microbiota?

bluenote-1577 commented 11 months ago

Hi @yi1873,

The important difference between kraken and sylph is that kraken classifies individual reads, whereas sylph does not classify every read. Instead, sylph requires some coverage over a genome in order to detect it, so the coverage of your target genome matters more than the total data volume.

What pathogen are you detecting? For bacteria, sylph works at roughly 0.1x coverage, sometimes lower, sometimes higher. For viruses, sylph requires higher coverage. So if you are detecting E. coli, you want more than about 400 kbp of E. coli reads in total.
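
As a rough illustration of that arithmetic, here is a minimal Python sketch. It is not part of sylph; the function names are made up for this example, the 4 Mbp genome size is a round number chosen to match the 400 kbp figure above (real E. coli genomes are somewhat larger), and the pathogen read fraction is purely hypothetical.

```python
def bases_needed(target_coverage: float, genome_size_bp: int) -> float:
    """Total bases from the target genome needed to reach a given coverage."""
    return target_coverage * genome_size_bp


def expected_coverage(n_reads: int, read_length_bp: int,
                      target_fraction: float, genome_size_bp: int) -> float:
    """Coverage of the target genome, given the fraction of reads that come from it."""
    target_bases = n_reads * read_length_bp * target_fraction
    return target_bases / genome_size_bp


# Assumption: a round 4 Mbp genome, to match the ~400 kbp figure for 0.1x.
GENOME_SIZE_BP = 4_000_000

# Bases of E. coli reads needed for ~0.1x coverage.
print(bases_needed(0.1, GENOME_SIZE_BP))

# Hypothetical sample: 20M reads of 50 bp, of which only 0.01% are E. coli.
# This gives well under 0.1x coverage, even though the file holds 1 Gbp in total.
print(expected_coverage(20_000_000, 50, 1e-4, GENOME_SIZE_BP))
```

The point of the second call is that a large file can still give low coverage of the pathogen if most reads come from something else, such as the host.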

Thanks,

Jim