poeli / GOTTCHA

More details and updates can be found on our homepage (http://lanl-bioinformatics.github.io/GOTTCHA) and on the LANL-Bioinformatics GitHub site (https://github.com/LANL-Bioinformatics/GOTTCHA).
GNU General Public License v3.0

"**WARNING**: GOTTCHA database inconsistency..." in test set #2

Open jonathanjacobs opened 9 years ago

jonathanjacobs commented 9 years ago

Paul / Tracy

FYI - when running GOTTCHA on a new install today, I noticed that test.gottcha.log ballooned to 2.5 MB while running the test data. It was filled with thousands of entries like this:

WARNING: GOTTCHA database inconsistency! Mapping entry has no match in reference [gi|238859724|gb|CP001396.1|001993456|001997781|]

Should that be a concern? (I'm thinking... yes?) The test.gottcha.tsv file looks like this:

| LEVEL | TAXA | REL_ABUNDANCE | LINEAR_LENGTH | TOTAL_BP_MAPPED | HIT_COUNT | HIT_COUNT_PLASMID | READ_COUNT | LINEAR_DOC | NORM_COV |
|---|---|---|---|---|---|---|---|---|---|
| phylum | Proteobacteria | 1.0000 | 209107 | 598234 | 18369 | 72 | 6198 | 2.86089896560134 | 1 |
| class | Gammaproteobacteria | 1.0000 | 209107 | 598234 | 18369 | 72 | 6198 | 2.86089896560134 | 1 |
| order | Enterobacteriales | 1.0000 | 209107 | 598234 | 18369 | 72 | 6198 | 2.86089896560134 | 1 |
| family | Enterobacteriaceae | 1.0000 | 209107 | 598234 | 18369 | 72 | 6198 | 2.86089896560134 | 1 |
| genus | Escherichia | 1.0000 | 209107 | 598234 | 18369 | 72 | 6198 | 2.86089896560134 | 1 |
| species | Escherichia coli | 1.0000 | 209107 | 598234 | 18369 | 72 | 6198 | 2.86089896560134 | 1 |
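To gauge how widespread the warnings are, the log can be tallied by reference header. The following is a rough sketch, not part of GOTTCHA: the regular expression is inferred solely from the one log line quoted above, and the function name `count_inconsistencies` and the command-line handling are hypothetical.

```python
import re
import sys
from collections import Counter

# Pattern inferred from the single warning line quoted in this issue;
# other message variants may exist in real logs (assumption).
WARNING_RE = re.compile(
    r"WARNING: GOTTCHA database inconsistency! "
    r"Mapping entry has no match in reference \[(?P<ref>[^\]]+)\]"
)


def count_inconsistencies(lines):
    """Return a Counter mapping each unmatched reference header to its warning count."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group("ref")] += 1
    return counts


# Hypothetical usage: python tally_warnings.py test.gottcha.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        counts = count_inconsistencies(handle)
    print(f"{sum(counts.values())} warnings across {len(counts)} distinct references")
    for ref, n in counts.most_common(10):
        print(n, ref)
```

If nearly every mapped reference shows up in the tally, that points to a systematic problem with how headers are matched rather than a few stray database entries.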

poeli commented 9 years ago

Hello Jonathan,

Thank you so much for your input. It's a bug that affects the name parsing. I have pushed a fix to our GitHub repository.

Paul