cdiener / architeuthis

Tools to analyze and summarize data for Kraken2.
https://cdiener.github.io/architeuthis
Apache License 2.0

Filter is incompatible with Kraken2's "use-names" parameter #2

Closed: zoey-rw closed this issue 1 month ago

zoey-rw commented 1 month ago

Hi! Thanks for developing this tool!

It seems that when Kraken2 is run with the "use-names" parameter, which replaces the taxID column with the scientific name followed by the taxID in parentheses, the filter command cannot parse the taxID from the Kraken2 output files. When I reran Kraken2 without "use-names", the error below disappeared.

It might be helpful for the documentation to include an example Kraken2 call, or just a note about avoiding the "use-names" flag. Alternatively, the architeuthis parser could look for taxIDs within parentheses, but that could be a problem if the scientific names contain parentheses as well.

(struo2)[zrwerbin@geo Struo2]$ architeuthis mapping filter $sample_output --db $DBDIR --data-dir $DB_taxonomy_dir --out $filtered_sample_report 
2024/07/11 14:35:08 Found taxonkit=0.17.0.
2024/07/11 14:35:08 Pass 1: Building the taxa database...
2024/07/11 14:35:08 Reading k-mer assignments from /projectnb2/talbot-lab-data/zrwerbin/soil_genome_db/misc_scripts/mock_community/kraken_results/soil_genome_db/mock_community.output.
2024/07/11 14:35:11 Processed 1000000 reads...
2024/07/11 14:35:14 Processed 2000000 reads...
2024/07/11 14:35:15 Processing 2464905 reads - Done.
2024/07/11 14:35:18 Pass 2: Score individuals reads...
2024/07/11 14:35:18 Reading k-mer assignments from /projectnb2/talbot-lab-data/zrwerbin/soil_genome_db/misc_scripts/mock_community/kraken_results/soil_genome_db/mock_community.output and writing to /projectnb2/talbot-lab-data/zrwerbin/soil_genome_db/misc_scripts/mock_community/kraken_results/soil_genome_db/filtered_mock_community.kreport.
2024/07/11 14:35:18 Could not parse taxon ID chromosome_7_0_0 for read Eremothecium gossypii ATCC 10895 (taxid 284811).
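
The worry about parentheses inside scientific names can be sidestepped by matching only a trailing "(taxid N)" group instead of any parenthesized text. Below is a minimal Python sketch of that idea. It is not how architeuthis parses its input. The helper names `parse_taxid` and `taxid_from_line` are hypothetical, and the sketch assumes the Kraken2 output is tab-separated with the taxon in the third column, written either as a bare number or as "name (taxid N)" when "use-names" is set.

```python
import re

# Assumption: with "use-names", the taxon column ends in "(taxid N)".
# Anchoring at the end of the field means parentheses that occur
# earlier in the scientific name are ignored.
_NAMED_TAXON = re.compile(r"\(taxid (\d+)\)\s*$")


def parse_taxid(field: str) -> int:
    """Return the taxID from a bare numeric field or a "name (taxid N)" field."""
    field = field.strip()
    if field.isdecimal():
        # Plain Kraken2 output: the column holds only the taxID.
        return int(field)
    match = _NAMED_TAXON.search(field)
    if match is None:
        raise ValueError(f"could not parse taxon ID from {field!r}")
    return int(match.group(1))


def taxid_from_line(line: str) -> int:
    """Extract the taxID from one tab-separated Kraken2 output line."""
    # Assumed column order: status, read ID, taxon, length, LCA mapping.
    columns = line.rstrip("\n").split("\t")
    return parse_taxid(columns[2])
```

For example, `parse_taxid("Eremothecium gossypii ATCC 10895 (taxid 284811)")` and `parse_taxid("284811")` both yield the integer 284811, so the same code path handles output produced with or without "use-names".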

cdiener commented 1 month ago

Oh yeah good point. Wouldn't be that hard to support that. I will add this.

cdiener commented 1 month ago

Should now work with version 0.3.0. Let me know if it does not.