monagrland / MB_Pipeline

Metabarcoding Pipeline for Illumina Sequencing Data
GNU Affero General Public License v3.0

multilvl_taxonomic_classification.py script fails if taxonomy strings truncated #12

Open kbseah opened 1 year ago

kbseah commented 1 year ago

Error message: "all arrays must be the same length"

The script appears to fail if the SINTAX-formatted headers in the reference database are not padded with placeholder names for missing lower taxonomic ranks. However, such padding is not necessarily desirable: it can lead to under-classification when known species are not fully identified in the reference database.
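One possible way around this, sketched below, is to parse each header into a fixed set of ranks and fill missing ranks with `None` rather than placeholder names, so every per-rank column ends up the same length. This is only an illustration and not the script's actual code: the function names `parse_sintax_header` and `taxonomy_columns`, the rank list, and the example headers are all hypothetical. It also assumes the error comes from building a table out of per-rank lists of unequal length, which I have not confirmed.

```python
import re

# SINTAX rank prefixes, highest to lowest (d=domain ... s=species).
# Assumption: the reference database uses only these prefixes.
RANKS = ["d", "k", "p", "c", "o", "f", "g", "s"]


def parse_sintax_header(header):
    """Return (seq_id, {rank: name or None}) for a SINTAX-style header.

    Hypothetical helper: ranks absent from the header stay None
    instead of being padded with a placeholder name.
    """
    seq_id = header.lstrip(">").split(";", 1)[0]
    match = re.search(r"tax=([^;]*)", header)
    taxonomy = dict.fromkeys(RANKS)  # every rank present, default None
    if match:
        for field in match.group(1).split(","):
            rank, sep, name = field.partition(":")
            if sep and rank in taxonomy:
                taxonomy[rank] = name
    return seq_id, taxonomy


def taxonomy_columns(headers):
    """Build one list per rank; all lists have len(headers) entries."""
    columns = {rank: [] for rank in RANKS}
    for header in headers:
        _, taxonomy = parse_sintax_header(header)
        for rank in RANKS:
            columns[rank].append(taxonomy[rank])
    return columns


# Made-up example headers, one complete and one truncated at class level.
headers = [
    ">seq1;tax=d:Eukaryota,p:Arthropoda,c:Insecta,o:Diptera,"
    "f:Culicidae,g:Aedes,s:Aedes_aegypti;",
    ">seq2;tax=d:Eukaryota,p:Arthropoda,c:Insecta;",
]
columns = taxonomy_columns(headers)
```

Columns built this way could be passed to a table constructor without a length mismatch, and downstream code can tell "rank not given" (`None`) apart from a real taxon name.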

kbseah commented 1 year ago

Also: could Snakemake handle the vsearch --usearch_global commands, instead of having them embedded in the script?