sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

MinHash initialization #338

Open luizirber opened 6 years ago

luizirber commented 6 years ago

(issue triggered by https://github.com/dib-lab/sourmash/blob/66461a4665471a5c0a2c5df02f8180c7ecaf5726/tests/test_signature.py#L51-L52)

Some API thinking: should we make n=0 the default, and allow initializing the MinHash with a ksize and one of (max_hash, scaled, n)? It is kind of weird to have to set n=0 when you want scaled or max_hash...
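To make the idea above concrete, here is a minimal sketch of what such a constructor could look like. It is not sourmash's actual MinHash class: the name `SketchParams` and its `mode` and `value` attributes are hypothetical, and the only behavior it illustrates is "ksize plus exactly one of n, scaled, or max_hash, with n defaulting to 0".

```python
from typing import Optional


class SketchParams:
    """Hypothetical parameter holder, for illustration only (not sourmash's API)."""

    def __init__(self, ksize: int, *, n: int = 0,
                 scaled: Optional[int] = None,
                 max_hash: Optional[int] = None):
        # Treat 0 and None as "not set", so n=0 can be the default.
        given = {"n": n, "scaled": scaled, "max_hash": max_hash}
        chosen = [name for name, value in given.items() if value]

        if len(chosen) != 1:
            raise ValueError(
                "specify exactly one of n, scaled, max_hash; got: "
                + (", ".join(chosen) or "none")
            )

        self.ksize = ksize
        self.mode = chosen[0]          # which of the three options was picked
        self.value = given[self.mode]  # the value supplied for that option


# Usage: no need to pass n=0 explicitly when asking for a scaled sketch.
params = SketchParams(31, scaled=1000)
print(params.mode, params.value)
```

With this shape, passing two of the three options (or none) fails at construction time instead of silently preferring one of them.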

In HLL I allowed changing error_rate and ksize as long as nothing had been added to the HLL yet, and throw an error once something is inserted. We could do something similar for MinHash, I think.
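The "mutable until first insert" behavior described above could be sketched roughly as follows. This is a toy class, not the HLL or MinHash implementation: `LockableSketch`, its `add_hash` method, and the choice of `RuntimeError` are all assumptions made for illustration.

```python
class LockableSketch:
    """Toy sketch whose ksize can change only while it is still empty."""

    def __init__(self, ksize: int):
        self._ksize = ksize
        self._hashes = set()

    @property
    def ksize(self) -> int:
        return self._ksize

    @ksize.setter
    def ksize(self, value: int) -> None:
        # Once any hash is stored, changing ksize would invalidate the
        # contents, so refuse instead of silently mixing k-mer sizes.
        if self._hashes:
            raise RuntimeError("cannot change ksize after hashes were added")
        self._ksize = value

    def add_hash(self, h: int) -> None:
        self._hashes.add(h)


# Usage: reconfiguring an empty sketch works, reconfiguring a filled one fails.
sketch = LockableSketch(ksize=21)
sketch.ksize = 31       # allowed: nothing inserted yet
sketch.add_hash(42)
try:
    sketch.ksize = 51   # rejected: the sketch already holds data
except RuntimeError as err:
    print(err)
```

The same guard could be applied to any other parameter that affects how stored hashes are interpreted.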

ctb commented 4 years ago

see proposal in https://github.com/dib-lab/sourmash/issues/999#issuecomment-633707771

ctb commented 4 years ago

I think we should remove max_hash in 4.0, too - see #1303.

ctb commented 3 years ago

punting this to 5.0.