emilhaegglund / TADA

A Snakemake-workflow to sample taxa from sequence databases based on taxonomical or phylogenetic information
MIT License

Support for GTDB v220 #6

Open lguy opened 4 months ago

lguy commented 4 months ago

Support for the latest release of GTDB would be nice to have. I've experimented with it, and so far the obstacle is in download_gtdb_metadata: the name of the metadata file has changed from {domain}_metadata_r{version}.tar.gz to {domain}_metadata_r{version}.tsv.gz. I haven't verified that the contents of the files are otherwise similar.
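
A minimal sketch of how the file name could be chosen per release, assuming the switch to .tsv.gz happens at release 220 and that all earlier releases use .tar.gz. The function name gtdb_metadata_filename and the example domain prefixes (bac120, ar53) are hypothetical and not taken from the TADA code base:

```python
def gtdb_metadata_filename(domain: str, version: int) -> str:
    """Build the GTDB metadata file name for a given release.

    Assumption: releases from 220 onwards ship a gzipped TSV, while
    earlier releases ship a gzipped tarball, as reported in this issue.
    """
    extension = "tsv.gz" if version >= 220 else "tar.gz"
    return f"{domain}_metadata_r{version}.{extension}"


# Hypothetical domain prefixes, used only for illustration.
print(gtdb_metadata_filename("bac120", 214))
print(gtdb_metadata_filename("bac120", 220))
```

A .tsv.gz file is a single compressed table rather than an archive, so a download rule written for the older releases would probably also need to skip its extraction step for release 220.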

emilhaegglund commented 4 months ago

Thanks for highlighting the new GTDB release. I have started to look at how this could be solved. In release 220, they provide completeness and contamination estimates from both CheckM and CheckM2.
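
One possible way to handle both sets of estimates is to pick the quality columns from the metadata header and fall back when the preferred tool is absent. This is only a sketch: the column names (checkm_completeness, checkm2_completeness, and so on) are assumptions that should be checked against the actual r220 header, and pick_quality_columns is a hypothetical helper, not part of TADA:

```python
def pick_quality_columns(header, prefer="checkm2"):
    """Return (completeness_column, contamination_column) from a header.

    Tries the preferred tool first and falls back to the other one.
    Column names are assumed, not confirmed against the GTDB files.
    """
    other = "checkm" if prefer == "checkm2" else "checkm2"
    for tool in (prefer, other):
        columns = (f"{tool}_completeness", f"{tool}_contamination")
        if all(column in header for column in columns):
            return columns
    raise KeyError("no completeness/contamination columns found in header")


# Illustrative headers only; real metadata files contain many more columns.
old_header = ["accession", "checkm_completeness", "checkm_contamination"]
new_header = old_header + ["checkm2_completeness", "checkm2_contamination"]

print(pick_quality_columns(old_header))
print(pick_quality_columns(new_header))
```

With a fallback of this kind, the same filtering code could run on older releases that only carry CheckM estimates and on 220, where the preferred tool could be exposed as a configuration option.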