ctb / magsearch

Workflow and config files for searching (very) large public databases with sourmash sketches
GNU Affero General Public License v3.0

use cases #5

Open ctb opened 1 year ago

ctb commented 1 year ago

checking whether a given sample is already in the database

differential privacy / dbGaP search (at the level of technical replicates)

biogeography: where might I look?

discovering more examples of strains/species of an interesting species/genus

outbreak detection across plants, humans, and animals (One Health)

spillover idea/spillover risk

"finding gut microbes" example writ larger

notification service of new matches

content-based (re)annotation of stuff in the SRA

ctb commented 1 year ago

postprocessing and cleaning MAGs / checking them against all the things

ctb commented 1 year ago

regulatory evaluation: "this organism is/is not widespread"

ctb commented 1 year ago

content-based identification of SRA data sets: scaled=1m, first 100 hashes, md5sum

or something like that