Open durrantmm opened 5 years ago
Hi Matt,
I've made a few changes to the default parameters for IGGsearch, which should produce more accurate results out of the box. Please give it another try and let me know how the results look.
However, I think a lot of the differences are to be expected, since the two methods use different strategies (universal genes in MIDAS vs. species-specific genes in IGGsearch), cover different numbers of species (5,900 total species in MIDAS vs. 23,790 in IGGsearch), and define species slightly differently (96.5% identity across marker genes for MIDAS vs. 95% genome-wide ANI over at least 20% of the genome for IGGsearch).
As for B. uniformis (OTU-04728): looking at the database file iggdb_v1.0.0_gut/iggdb_v1.0.0_gut.species, I see that this species has zero marker genes, which explains why it was not reported by IGGsearch. Of the 4,558 gut species in the IGGsearch database, 99% have at least 1 marker gene and 95% have at least 10. Unfortunately, B. uniformis and several other common species are among the 1% with no marker genes.
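For anyone who wants to run the same kind of check, here is a rough Python sketch. It assumes the .species file is tab-delimited with a header row, and the column names `species_id` and `n_markers` are hypothetical placeholders (check the actual header and pass the real names). The table in the example is made-up toy data, not taken from the IGGsearch database.

```python
import csv
import io


def marker_summary(handle, id_col="species_id", count_col="n_markers"):
    """Summarise marker-gene counts from a tab-delimited species table.

    id_col and count_col are hypothetical column names; replace them with
    whatever the real file header uses.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    total = 0
    at_least_1 = 0
    at_least_10 = 0
    no_markers = []  # species that a marker-based search cannot report
    for row in reader:
        n = int(row[count_col])
        total += 1
        if n >= 1:
            at_least_1 += 1
        else:
            no_markers.append(row[id_col])
        if n >= 10:
            at_least_10 += 1
    return {
        "total": total,
        "frac_at_least_1": at_least_1 / total if total else 0.0,
        "frac_at_least_10": at_least_10 / total if total else 0.0,
        "no_markers": no_markers,
    }


# Toy table for illustration only (invented IDs and counts).
TOY_TABLE = (
    "species_id\tn_markers\n"
    "sp_A\t120\n"
    "sp_B\t0\n"
    "sp_C\t7\n"
    "sp_D\t15\n"
)

summary = marker_summary(io.StringIO(TOY_TABLE))
print(summary)
```

To run it on the real database, open the .species file with `open(path, newline="")` and pass the handle in place of the `io.StringIO` object. Any species listed under `no_markers` would be invisible to a marker-gene search regardless of its abundance in the sample.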
I will look into why I was unable to identify marker genes for B. uniformis and try to add these genes to the database in the near future.
Thanks, Stephen
I decided to compare the output of IGGsearch with the output of MIDAS using the default database.
I have noticed that the two approaches return quite different results.
Here are the top results for IGGsearch:
And the top results for MIDAS:
You can see that the most abundant microbe according to MIDAS is B. uniformis with 257x coverage. IGGsearch does not show B. uniformis anywhere in its top results.
What do you think explains the difference between the two tools?