pirovc / grimer

GRIMER performs analysis of microbiome studies and generates a portable and interactive dashboard integrating annotation, taxonomy and metadata with focus on contamination detection.
https://pirovc.github.io/grimer/
MIT License

List of contaminants #4

Closed: MjelleLab closed this issue 1 year ago

MjelleLab commented 1 year ago

Hi, thanks for developing a nice tool! I wonder if you have the species or genera names of all the proposed contaminants you detected from your meta-analysis (contaminants.yml)? Regards

pirovc commented 1 year ago

Glad it found a happy user :)

Sure, here is the tab-separated file (taxid, genus name, species name): contaminants_genus_species_names.txt

Here is the script used to generate it, in case you need to tweak it:

#!/usr/bin/env python3

# pip install pyyaml multitax

import yaml
from multitax import NcbiTx

infile = "contaminants.yml"
outfile = "contaminants_genus_species_names.txt"

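# Collect the unique taxids from every "ids" list in the two-level YAML structure.
with open(infile) as f:
    c = yaml.safe_load(f)
taxids = {str(i) for c1 in c.values() for c2 in c1.values() for i in c2["ids"]}

# Load the NCBI taxonomy (downloaded when no local files are given).
tax = NcbiTx()

# Write one row per taxid: taxid, genus name, species name.
with open(outfile, "w") as outf:
    for t in sorted(taxids):
        print(t, *tax.name_lineage(t, ranks=["genus", "species"]), sep="\t", file=outf)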
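
To work with the resulting file, something like the following could group the species names by genus. This is only a minimal sketch: it assumes the three tab-separated columns described above (taxid, genus name, species name) and that a rank missing from the lineage is written as an empty field. The function name `species_by_genus`, the placeholder labels, and the sample rows are hypothetical and not taken from contaminants.yml.

```python
import csv
import io
from collections import defaultdict


def species_by_genus(handle):
    """Group species names by genus from a (taxid, genus, species) TSV.

    Hypothetical helper; the column order is assumed from the description
    of contaminants_genus_species_names.txt.
    """
    groups = defaultdict(set)
    # QUOTE_NONE keeps any quote characters in taxon names untouched.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Pad short rows so a missing genus or species becomes "".
        taxid, genus, species = (row + ["", "", ""])[:3]
        if not taxid:
            continue  # skip blank lines
        groups[genus or "(no genus)"].add(species or "(genus-level entry)")
    # Return plain sorted structures so the output is deterministic.
    return {g: sorted(names) for g, names in sorted(groups.items())}


# Illustrative rows only; replace with open("contaminants_genus_species_names.txt").
sample = (
    "1280\tStaphylococcus\tStaphylococcus aureus\n"
    "1279\tStaphylococcus\t\n"
    "562\tEscherichia\tEscherichia coli\n"
)

for genus, names in species_by_genus(io.StringIO(sample)).items():
    print(genus, names, sep="\t")
```

For the real file, pass an open file handle instead of the `io.StringIO` sample.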