-
With #1967, we now estimate ANI for any scaled sketch comparison, regardless of sketch size. These estimates may be inaccurate for viruses and other small genomes.
context from https://github.com/sourm…
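To make the small-genome concern concrete, here is a rough sketch of how few hashes a small sketch retains. It assumes an idealized model in which each distinct k-mer is kept independently with probability `1/scaled`; the function names are hypothetical and not part of the sourmash API.

```python
def expected_hashes(n_kmers: int, scaled: int) -> float:
    """Expected number of retained hashes under the idealized 1/scaled model."""
    if n_kmers < 0 or scaled < 1:
        raise ValueError("n_kmers must be >= 0 and scaled must be >= 1")
    return n_kmers / scaled


def p_at_least_one_hash(n_kmers: int, scaled: int) -> float:
    """Probability that at least one of n_kmers distinct k-mers is retained."""
    if n_kmers < 0 or scaled < 1:
        raise ValueError("n_kmers must be >= 0 and scaled must be >= 1")
    # Each k-mer is dropped with probability (1 - 1/scaled); all n_kmers
    # are dropped with probability (1 - 1/scaled) ** n_kmers.
    return 1.0 - (1.0 - 1.0 / scaled) ** n_kmers


if __name__ == "__main__":
    # Illustrative sizes only: a ~5 kb virus versus a ~5 Mb bacterial genome.
    for n_kmers in (5_000, 5_000_000):
        print(n_kmers, expected_hashes(n_kmers, 1000), p_at_least_one_hash(n_kmers, 1000))
```

With only a handful of hashes in a sketch, one hash gained or lost shifts containment a lot, which is one plausible reason the ANI estimate gets noisy for small genomes.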
-
When we see a number like "50%" in containment, it could be caused by missing content (e.g. low coverage but otherwise exact matches) or by strain variation. It seems like we should be able to do a k-mer …
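For a sense of scale on the strain-variation side, here is a minimal sketch that converts containment to an ANI point estimate. It assumes a simple model where mutations are independent and a k-mer is shared only if all `k` of its bases match, so containment ≈ ANI^k. `containment_to_ani` is a hypothetical helper, not necessarily the exact calculation sourmash performs.

```python
def containment_to_ani(containment: float, ksize: int) -> float:
    """ANI point estimate assuming containment ~= ANI ** ksize."""
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be between 0 and 1")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    # Invert containment = ANI ** ksize.
    return containment ** (1.0 / ksize)


if __name__ == "__main__":
    for c in (1.0, 0.9, 0.5, 0.1):
        print(f"containment={c:.2f} k=31 -> ANI estimate {containment_to_ani(c, 31):.4f}")
```

This model attributes all missing containment to mutation, so on its own it cannot separate the two cases above: 50% containment from low coverage and 50% containment from divergence produce the same number.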
-
Per: https://github.com/ctb/2022-sourmash-sens-spec/blob/main/fracminhash-runs-simulate.ipynb
```
in M=100 k-mers, p of finding at least one hash is: 63.43% - scaled=100
in M=200 k-mers, p of fin…
-
Hello, the database docs say:
> The signatures were calculated with a scaled of 1000, which robustly supports searches for ~10kb or larger matches.
>
I want to know how "~10kb" was estimated, and d…
-
(this can be a bit of a running issue until we get around to fixing them ;)
The introduction of ANI output in [v4.4.0](https://github.com/sourmash-bio/sourmash/releases/tag/v4.4.0) is pretty awesom…
-
Hey,
I generated sourmash signatures (scaled 1000) for human WGS samples and used kSpider to cluster them based on these signatures. However, I discovered that the resulting clusters exhibited …
-
* standard moltype, ksize, scaled parsing
* `FracMinHash` template class creation based on args
* loading / selection of databases based on argparse stuff via `sourmash.load_file_as_index`
-
Luiz's talk about how we develop/design/evolve sourmash: https://www.youtube.com/watch?v=0jpnP8NtRfc&feature=youtu.be
-
Ref #606.
The reviewers for [Pierce et al., 2019](https://f1000research.com/articles/8-1006) were enthusiastic about some of the algorithmic ideas we hinted at in sourmash (and correctly pointed ou…
-
Over in https://github.com/sourmash-bio/sourmash/pull/1837, I'm discovering some fun challenges with manifests 🎉.
First, it turns out that manifests do not contain `seed` or `license` (also see ht…