LangilleLab / microbiome_helper

A repository of bioinformatic scripts, SOPs, and tutorials for analyzing microbiome data.
GNU General Public License v3.0

Added script to fix QIIME2-based taxonomy SPF files #38

Closed (gavinmdouglas closed this 5 years ago)

gavinmdouglas commented 5 years ago

Added a script that replaces any taxon containing any of the following strings (case insensitive) with “Unclassified”: uncultured, ambiguous_taxa, metagenome, unidentified.

Any taxon containing “unknown” (case insensitive) is replaced with the preceding taxon's label followed by “X”, while keeping its own “D” level prefix.

For example:

D_2__Oxyphotobacteria D_3__Oxyphotobacteria Incertae Sedis D_4__Unknown Family

Becomes:

D_2__Oxyphotobacteria D_3__Oxyphotobacteria Incertae Sedis D_4__Oxyphotobacteria Incertae SedisX
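
The two replacement rules described above can be sketched roughly as follows. This is a minimal illustration, not the script added in this PR. The function name `fix_taxa`, the constant names, and the list-of-levels input are hypothetical. It assumes each level carries a `D_<n>__` prefix. Three behaviours are assumptions the description does not cover: the “Unclassified” rule is checked first, the preceding label is taken after it has been fixed, and an “unknown” taxon with no preceding level is left unchanged.

```python
import re

# Substrings that mark a taxon as "Unclassified" (matched case-insensitively).
UNCLASSIFIED_MARKERS = ("uncultured", "ambiguous_taxa", "metagenome", "unidentified")

# Assumed label layout: a "D_<n>__" level prefix followed by the taxon name.
PREFIX = re.compile(r"^(D_\d+__)(.*)$")


def fix_taxa(levels):
    """Return a cleaned copy of one lineage given as a list of level labels.

    Hypothetical helper: illustrates the rules only, not the PR's actual script.
    """
    fixed = []
    for taxon in levels:
        lowered = taxon.lower()
        if any(marker in lowered for marker in UNCLASSIFIED_MARKERS):
            # Rule 1: replace the whole label with "Unclassified".
            fixed.append("Unclassified")
        elif "unknown" in lowered and fixed:
            # Rule 2: keep this level's own prefix, borrow the preceding
            # (already fixed) label's name, and append "X".
            own = PREFIX.match(taxon)
            own_prefix = own.group(1) if own else ""
            prev = PREFIX.match(fixed[-1])
            prev_name = prev.group(2) if prev else fixed[-1]
            fixed.append(f"{own_prefix}{prev_name}X")
        else:
            # Assumption: anything else (including an "unknown" taxon with
            # no preceding level) is passed through unchanged.
            fixed.append(taxon)
    return fixed


lineage = [
    "D_2__Oxyphotobacteria",
    "D_3__Oxyphotobacteria Incertae Sedis",
    "D_4__Unknown Family",
]
print(fix_taxa(lineage))
```

Applied to the lineage from the example above, the last level comes out as `D_4__Oxyphotobacteria Incertae SedisX`, while the first two levels are untouched.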