mbhall88 / tbpore

Mycobacterium tuberculosis genomic analysis from Nanopore sequencing data
MIT License

Add option to set decontamination database index path #42

Closed: mbhall88 closed this 2 years ago

mbhall88 commented 2 years ago

Addresses https://github.com/mbhall88/tbpore/issues/34#issuecomment-1271666527.

Also updates mykrobe to v0.12, as this version uses the newest panel, which includes the WHO catalogue.

codecov-commenter commented 2 years ago

Codecov Report

Merging #42 (a26dd8a) into main (19fcc3b) will decrease coverage by 1.53%. The diff coverage is 50.00%.

@@            Coverage Diff             @@
##             main      #42      +/-   ##
==========================================
- Coverage   92.66%   91.12%   -1.54%     
==========================================
  Files           7        7              
  Lines         327      338      +11     
  Branches       43       47       +4     
==========================================
+ Hits          303      308       +5     
- Misses         16       19       +3     
- Partials        8       11       +3     
| Flag | Coverage Δ |
|---|---|
| unittests | 91.12% <50.00%> (-1.54%) :arrow_down: |


| Impacted Files | Coverage Δ |
|---|---|
| tbpore/tbpore.py | 89.22% <50.00%> (-3.09%) :arrow_down: |


leoisl commented 2 years ago

Argh, I just noticed that the decontamination DB consists of two files: the large minimap2 index and the metadata about the indexed genomes (remove_contam.tsv.gz). So, to be precise, the --db option should point to a directory containing both files.

If the user specifies a different decontamination index (not just a different location, but an actually different index), then data/decontamination_db/remove_contam.tsv.gz might not contain the correct metadata for the genomes in that index. It would even be complicated to update: the user would need to update the metadata file data/decontamination_db/remove_contam.tsv.gz while the index lies somewhere else, and I think this update might not be possible in a container? I don't think we have this use case right now, as we always use the same decontamination DB, but the current --db option implementation introduces this bug if a different index is used...
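To make the directory-based idea concrete, here is a minimal sketch of what resolving a `--db` directory could look like. It is not tbpore's actual implementation: the function name `resolve_db_dir` is hypothetical, and it assumes the minimap2 index can be recognised by the conventional `.mmi` extension. Only the metadata file name `remove_contam.tsv.gz` comes from this thread.

```python
from pathlib import Path
from typing import Tuple

METADATA_NAME = "remove_contam.tsv.gz"  # file name quoted in this thread
INDEX_SUFFIX = ".mmi"  # assumption: conventional minimap2 index extension


def resolve_db_dir(db_dir: Path) -> Tuple[Path, Path]:
    """Return (index, metadata) found in a decontamination DB directory.

    Hypothetical helper: fails early if the directory does not hold
    both files, so the index and its metadata cannot drift apart.
    """
    db_dir = Path(db_dir)
    metadata = db_dir / METADATA_NAME
    if not metadata.is_file():
        raise FileNotFoundError(f"Missing metadata file: {metadata}")

    # Collect every file that looks like a minimap2 index.
    indexes = sorted(p for p in db_dir.iterdir() if p.suffix == INDEX_SUFFIX)
    if len(indexes) != 1:
        raise ValueError(
            f"Expected exactly one *{INDEX_SUFFIX} file in {db_dir}, "
            f"found {len(indexes)}"
        )
    return indexes[0], metadata
```

Keeping both files in one directory means a user who swaps in a different index has to supply matching metadata next to it, which would also work when the directory is mounted into a container.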

mbhall88 commented 2 years ago

True....

We could also add a metadata option? And just document that if the decontamination index you're using is not the one downloaded from us, then you should also provide a metadata file?
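As a rough illustration of the metadata-option idea, the sketch below pairs `--db` with a hypothetical `--metadata` flag and warns when a custom index is given without its own metadata. It uses the standard library's `argparse` only to stay self-contained and does not reflect how tbpore's CLI is actually built. The default index path and the names `resolve_db` and `build_parser` are placeholders; only `--db` and the metadata path come from this thread.

```python
import argparse
import warnings
from pathlib import Path
from typing import List, Optional, Tuple

# Metadata path as quoted in this thread.
DEFAULT_METADATA = Path("data/decontamination_db/remove_contam.tsv.gz")
# Hypothetical placeholder: the real index file name is not given here.
DEFAULT_INDEX = Path("data/decontamination_db/index.mmi")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="tbpore-sketch")
    parser.add_argument("--db", type=Path, default=None,
                        help="Path to the decontamination minimap2 index")
    # Hypothetical option name, following the suggestion above.
    parser.add_argument("--metadata", type=Path, default=None,
                        help="Metadata for the genomes in the index")
    return parser


def resolve_db(argv: Optional[List[str]] = None) -> Tuple[Path, Path]:
    """Return (index, metadata), warning if they may be mismatched."""
    args = build_parser().parse_args(argv)
    index = args.db if args.db is not None else DEFAULT_INDEX
    metadata = args.metadata if args.metadata is not None else DEFAULT_METADATA

    # A custom index with the bundled metadata is only safe if the index
    # is the same one that was downloaded, just stored elsewhere.
    if args.db is not None and args.metadata is None:
        warnings.warn(
            "Custom --db given without --metadata; the default metadata "
            "is only valid for the default decontamination index.",
            UserWarning,
        )
    return index, metadata
```

A warning rather than a hard error keeps the original use case working (the same index in a different location) while still flagging the mismatch risk; whether that is the right trade-off is a design choice for the maintainers.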