Hadza identifiers don't match master table: Nepal_MoBio_Fiber-Hadza-Nepal_B_5_THA1056YZ.71
CRISPR spacers: matched against all 3 databases
k-mers: matched against only 2 of the 3 databases
MySQL tables
host_taxonomy: only has GTDB r202 (258406) / UHGG v2.0 (289232)
hadza_mags: 54779 and includes mapping to metadata table
host_genomes: the table Bryan provided for the genomes he's using
uses GTDB r207
otu_to_host_spacers: contains matches to all 3 databases
otu_to_host_kmers: not updated with Hadza matches
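A quick way to confirm which databases each match table actually covers is sketched below. This is only a sketch under stated assumptions: the `source_db`, `otu_id` and `host_genome` columns are hypothetical (the real schema isn't recorded in these notes), the rows are toy data, and an in-memory `sqlite3` database stands in for the MySQL server so the example runs on its own.

```python
import sqlite3

# The three databases named in the To Do item.
EXPECTED_DATABASES = {"Hadza", "GTDB-gut", "UHGG"}


def databases_missing_from(conn, table, db_column="source_db"):
    """Return the expected databases that have no rows in `table`.

    Table and column names cannot be bound as query parameters,
    so only pass trusted, hard-coded names here.
    """
    rows = conn.execute(f"SELECT DISTINCT {db_column} FROM {table}").fetchall()
    present = {row[0] for row in rows}
    return EXPECTED_DATABASES - present


# Toy stand-in for the MySQL tables; the column layout is an assumption.
conn = sqlite3.connect(":memory:")
for table in ("otu_to_host_spacers", "otu_to_host_kmers"):
    conn.execute(
        f"CREATE TABLE {table} (otu_id TEXT, host_genome TEXT, source_db TEXT)"
    )

conn.executemany(
    "INSERT INTO otu_to_host_spacers VALUES (?, ?, ?)",
    [
        ("otu_1", "toy_genome_A", "Hadza"),
        ("otu_1", "toy_genome_B", "GTDB-gut"),
        ("otu_2", "toy_genome_C", "UHGG"),
    ],
)
conn.executemany(
    "INSERT INTO otu_to_host_kmers VALUES (?, ?, ?)",
    [
        ("otu_1", "toy_genome_B", "GTDB-gut"),
        ("otu_2", "toy_genome_C", "UHGG"),
    ],
)

missing_spacers = databases_missing_from(conn, "otu_to_host_spacers")
missing_kmers = databases_missing_from(conn, "otu_to_host_kmers")
print("spacers missing:", sorted(missing_spacers))
print("kmers missing:", sorted(missing_kmers))
```

With the toy rows, the spacer table covers all three databases and the k-mer table lacks Hadza, mirroring the state described in the table list above.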
To Do
[x] rerun host prediction using correct input data (Hadza, GTDB-gut, UHGG)
Genome info counts (mysql = host_genomes):
host_taxonomy.tsv file: looks good!
Do I have hits to UHGG genomes absent from list?
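That question is an anti-join: take the genome IDs that appear in the hits and subtract the IDs in the genome list. A minimal sketch, assuming both sides can be loaded as plain ID strings; the function name and the toy IDs are hypothetical, not real UHGG accessions.

```python
def genomes_absent_from_list(hit_genome_ids, listed_genome_ids):
    """Return the sorted hit genome IDs that do not appear in the genome list."""
    # Sets remove duplicate hits and make the membership test cheap.
    return sorted(set(hit_genome_ids) - set(listed_genome_ids))


# Toy IDs only; real IDs would come from the match tables and host_genomes.
hits = ["toy_genome_A", "toy_genome_B", "toy_genome_C", "toy_genome_B"]
listed = ["toy_genome_A", "toy_genome_B"]

absent = genomes_absent_from_list(hits, listed)
print(absent)
```

An empty result means every hit genome is in the list. The same check could be run inside MySQL with a `LEFT JOIN ... WHERE ... IS NULL`, but the column to join on is not recorded in these notes.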