Classify all the metagenomes. ALL THE METAGENOMES. (Eventually.)
Analysis idea: re-cluster the SRA on taxonomy profiles or on Jaccard similarity to produce de novo "biomes", where each biome inherits its label from the majority of the runs in its cluster #2
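A minimal sketch of that idea, assuming each run is already reduced to a set of features (k-mer hashes or taxon IDs). The run IDs, the feature sets, the 0.5 threshold, and the helper names `cluster_by_jaccard` and `majority_labels` are all hypothetical, and single-linkage clustering is just one simple choice, not a settled method:

```python
from collections import Counter, defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cluster_by_jaccard(profiles, threshold=0.5):
    """Single-linkage clustering: join runs whose similarity >= threshold."""
    parent = {run: run for run in profiles}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    runs = sorted(profiles)
    for i, r1 in enumerate(runs):
        for r2 in runs[i + 1:]:
            if jaccard(profiles[r1], profiles[r2]) >= threshold:
                parent[find(r1)] = find(r2)

    clusters = defaultdict(list)
    for run in runs:
        clusters[find(run)].append(run)
    return list(clusters.values())


def majority_labels(clusters, labels):
    """Give every run the most common non-missing label in its cluster."""
    inferred = {}
    for members in clusters:
        votes = Counter(labels[r] for r in members if labels.get(r))
        winner = votes.most_common(1)[0][0] if votes else None
        for r in members:
            inferred[r] = winner
    return inferred


# Toy data (made up for illustration): feature sets and submitted labels.
profiles = {
    "run_a": {1, 2, 3, 4},
    "run_b": {2, 3, 4, 5},
    "run_c": {1, 2, 3, 4, 5},
    "run_d": {10, 11, 12},
    "run_e": {10, 11, 13},
}
labels = {
    "run_a": "marine metagenome",
    "run_b": "seawater metagenome",
    "run_c": "marine metagenome",
    "run_d": "soil metagenome",
    "run_e": None,  # missing (NA) label
}

clusters = cluster_by_jaccard(profiles, threshold=0.5)
inferred = majority_labels(clusters, labels)
```

Missing labels are skipped when voting, so an unlabelled run picks up the label of its neighbours. The all-pairs comparison is quadratic in the number of runs, so anything at SRA scale would need an index or sketch-based search instead.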
Besides mislabelled data, there seem to be a lot of NAs and duplicate labels -- like "seawater metagenome" vs. "marine metagenome." Are these basically the same thing? Can we infer more granular structure than the given ScientificNames provide?