sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

new GTDB databases for rs220: draft #3246

Open ctb opened 3 months ago

ctb commented 3 months ago

Note: not yet final

GTDB release rs220: https://data.ace.uq.edu.au/public/gtdb/data/releases/release220/

sourmash databases constructed by @ccbaumler 🎉 using directsketch -

Note: The databases are missing approximately 200 genomes that have been deprecated/suspended by GenBank.

Questions/thoughts

Related issues:

nmb85 commented 2 months ago

Are there sketches of a reduced version of the GTDB rs220 release containing only the representative genomes? Not complaining if not. Thank you, @ccbaumler, for what you've provided here!

ccbaumler commented 2 months ago

Yup! Representative genomes have been sketched. I will be meeting with @ctb next week to get everything set up.