sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

fast clustering of many large sketches - kspider #2271

Open ctb opened 2 years ago

ctb commented 2 years ago

@mr-eyes has been working steadily on using kSpider (docs and repo) to cluster many large collections of k-mers, and has achieved some impressive results.

I'm opening this issue because I wanted to link some of the kSpider work into this repo so that it is discoverable by sourmash aficionados!

@mr-eyes if you have a tutorial or some guidance for people wanting to try out kSpider with sourmash sketches, please point to it here!

mr-eyes commented 2 years ago

I will be updating the docs to cover the latest kSpider development and will add some tutorials on how to run it on sourmash sigs. I will update this issue when I am done.