ohnosequences / mg7

Configurable and scalable 16S metagenomics data analysis
https://goo.gl/y3rZFD
GNU Affero General Public License v3.0

Accumulated counting algorithm is inconsistent #62

laughedelic closed 8 years ago

laughedelic commented 8 years ago

Some inconsistencies in the accumulated counts were discovered.

laughedelic commented 8 years ago

I've checked the lineages (using Bio4j locally) of these three taxa, which had the same direct and accumulated counts (the first one hadn't accumulated its descendants' counts):

| Taxon | Lineage (IDs) |
| --- | --- |
| Streptococcus agalactiae | 1, 131567, 2, 1783272, 1239, 91061, 186826, 1300, 1301, 1311 |
| Streptococcus agalactiae FSL S3-608 | 1, 131567, 2, 1783272, 1239, 91061, 186826, 1300, 1301, 1311, 1154778 |
| Streptococcus agalactiae H36B | 1, 131567, 2, 1783272, 1239, 91061, 186826, 1300, 1301, 1311, 342615 |

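As you can see, the lineages are correct: both strains sit directly below Streptococcus agalactiae (1311), so its accumulated count should include theirs. So I'm going to test the counting algorithm and see what the problem is.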
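
For reference, here is a minimal sketch of what accumulated counting is expected to produce for these three taxa. It is not the mg7 implementation: the function name `accumulated_counts` and the direct counts are made up for illustration, and only the lineages come from the listing above. The idea is that each taxon's direct count is added to every node of its lineage, itself included.

```python
from collections import Counter

# Shared part of the lineage, root first, ending at Streptococcus agalactiae (1311).
AGALACTIAE = [1, 131567, 2, 1783272, 1239, 91061, 186826, 1300, 1301, 1311]

# Lineages as listed above; the last ID in each list is the taxon itself.
LINEAGES = {
    1311: AGALACTIAE,
    1154778: AGALACTIAE + [1154778],  # S. agalactiae FSL S3-608
    342615: AGALACTIAE + [342615],    # S. agalactiae H36B
}


def accumulated_counts(direct, lineages):
    """Add each taxon's direct count to every node of its lineage (itself included)."""
    acc = Counter()
    for taxon, count in direct.items():
        for node in lineages[taxon]:
            acc[node] += count
    return acc


# Hypothetical direct counts, NOT taken from the real data.
direct = {1311: 5, 1154778: 2, 342615: 3}
acc = accumulated_counts(direct, LINEAGES)

for taxon in direct:
    print(taxon, "direct:", direct[taxon], "accumulated:", acc[taxon])
```

With these made-up numbers, the species (1311) should end up with an accumulated count equal to its own direct count plus those of both strains, while each strain, having no descendants here, keeps accumulated equal to direct. A species whose accumulated count equals its direct count even though its strains have assignments is the inconsistency described above.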

laughedelic commented 8 years ago

I think I know what the problem is, but I'm still writing some simple tests.

laughedelic commented 8 years ago

This is supposedly fixed. We can try it on the mock data.

laughedelic commented 8 years ago

This fix is published as version 1.0.0-M3-5c435ad.

laughedelic commented 8 years ago

OK. After testing (https://github.com/era7bio/mg7-test/issues/12#issuecomment-213997955), it seems to be fixed. Merging.