Open fconstancias opened 6 years ago
Hi Flo,
I don't think there is an easy way to reduce false positives (FP) here, other than checking whole-genome coverage again by aligning reads against the genomes of the hits Centrifuge reports.
But perhaps https://github.com/fbreitwieser/krakenhll might be worth checking.
Cheers, Leon
Hi Leon,
Thanks for your help. I have already started to explore krakenhll. I have also just come across https://github.com/seqan/slimm, which uses genome coverage information to filter out the noise. That makes a lot of sense to me.
Cheers,
Flo
Another alternative for filtering (and visualizing) Centrifuge results by score, length, log length or score/length is Recentrifuge. You don't need negative control samples to use it in this way, but if you have one or more of them, Recentrifuge can go further.
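For anyone who wants to try simple per-read thresholds before reaching for a dedicated tool, here is a minimal sketch in Python. It is not how Recentrifuge or Centrifuge work internally. It assumes the default tab-separated Centrifuge classification output with the header columns `readID`, `seqID`, `taxID`, `score`, `2ndBestScore`, `hitLength`, `queryLength` and `numMatches`. The function names, the threshold values and the example rows are made up for illustration and would need tuning for real data.

```python
import csv
import io
from collections import Counter


def filter_hits(tsv_text, min_hit_length=50, min_score=300):
    """Keep rows whose hitLength and score both reach the thresholds.

    The default thresholds are illustrative placeholders, not recommendations.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        if int(row["hitLength"]) >= min_hit_length and int(row["score"]) >= min_score:
            kept.append(row)
    return kept


def reads_per_taxon(rows):
    """Count the remaining assignments per taxID."""
    return Counter(row["taxID"] for row in rows)


# Made-up example rows in the assumed column layout.
EXAMPLE = (
    "readID\tseqID\ttaxID\tscore\t2ndBestScore\thitLength\tqueryLength\tnumMatches\n"
    "r1\tseqA\t562\t2500\t0\t100\t100\t1\n"
    "r2\tseqB\t1280\t144\t0\t27\t100\t1\n"
    "r3\tseqA\t562\t900\t0\t60\t100\t1\n"
)

if __name__ == "__main__":
    kept = filter_hits(EXAMPLE)
    for taxid, count in reads_per_taxon(kept).items():
        print(taxid, count)
```

In this toy input the short, low-scoring hit to taxID 1280 is dropped and only the two hits to taxID 562 remain. A ratio such as score divided by hitLength could be added as a further condition in the same loop.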
Hi all,
Thanks a lot for your efforts developing this tool. I am using Centrifuge for taxonomic profiling of metagenomes from various ecosystems, and I am currently building databases for bacteria, viruses, fungi and archaea using your centrifuge-download tool.
To reduce the detection of potential false-positive taxa, could you help me adopt a rational approach to filtering the Centrifuge classification output using parameters such as hitLength and score?
Is there any way to get the breadth of coverage for each taxon? This might also be a good way to get rid of potential false positives.
Any guidance or suggestions would be welcome.
Thanks a lot.
Flo
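To make the breadth-of-coverage idea concrete: breadth is the fraction of reference positions covered by at least one aligned read. The sketch below is only an illustration. It assumes the reads have already been aligned to a reference genome in a separate step, and that each alignment is available as a 0-based, half-open `(start, end)` interval. The function name `breadth_of_coverage` is hypothetical and not part of Centrifuge.

```python
def breadth_of_coverage(intervals, genome_length):
    """Return the fraction of genome positions covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open.
    genome_length: length of the reference sequence in bases.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip each interval to the bounds of the reference.
        start = max(start, 0)
        end = min(end, genome_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # A gap: close the previous merged block and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


if __name__ == "__main__":
    # Three made-up alignments on a 100 bp reference.
    print(breadth_of_coverage([(0, 50), (25, 75), (90, 100)], 100))
```

A taxon with many reads piled onto a small fraction of its genome (low breadth) is a more likely false positive than one whose reads are spread across the genome, which is the rationale behind coverage-based filtering.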