sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

classify plant reads in metagenomes #3172

Open gabridinosauro opened 3 months ago

gabridinosauro commented 3 months ago

Dear Sourmash team,

I hope you are all well. I have a project with shotgun metagenomic data from wild rodents, and I would like to see whether I can classify reads against plant genomes to get an idea of the rodents' diet.

Is it possible to do this with sourmash? I suppose I would have to build my own database, as I do not see any available databases that contain plants.

Thanks in advance.

Gabriele

ctb commented 1 month ago

Hi Gabri, sorry for ignoring your issue for so long 😭

Short version - we don't have anything formal for plants, BUT if you can find a listing of all the things you want - maybe an assembly_summary file? - we can put together a recipe for sketching it quickly. Sound good?

gabridinosauro commented 1 month ago

Hi Titus, sounds great. Attached is the list of all plant genomes marked as reference in GenBank, with their accession numbers and taxIDs. Is that fine, or do you need any other info?

Thanks a lot again! Gabri

Rerence_genomes_plants_genbank.csv