-
I seem to be getting an error when trying to call one of the tokenizer functions from the [tokenizers](https://github.com/ropensci/tokenizers) package. Any idea why this might be going on?
``` r
l…
-
As far as I know, there is only one other package for LSH in R - [textreuse](https://github.com/ropensci/textreuse). It is well documented and tested, but about 5x slower:
``` r
# de…
-
In addition to these failures, two important requirements are not declared: khmer and sphinx.
doc/api-example.rst .
sourmash_lib/__init__.py ..
sourmash_lib/logging.py .......
sourmash_lib/test_…
-
`test_estimators.py::test_pickle` only checks the values on the Estimator object, and does not confirm that the new `Estimator.mh` object behaves properly; we should check all the behavior, including …
-
see https://github.com/dib-lab/sourmash/pull/83#discussion_r94702545
-
At the moment we store:
1. entire "permutation" hash matrix for **minhashing**
2. entire random projections for **sketching**
We should not store these matrices at all - do hashing on the fly and **ke…
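
To illustrate the "hash on the fly" idea for the minhashing case only, here is a minimal Python sketch (not the package's R code). It assumes each hash function can be derived from a `(seed, index)` pair on demand, so only the seed and the number of hashes need to be kept. The function name `minhash_signature`, its parameters, and the choice of BLAKE2b are all hypothetical.

```python
import hashlib


def minhash_signature(tokens, num_hashes=8, seed=0):
    """Compute a minhash signature without storing any hash matrix.

    Hypothetical helper: the i-th hash function is recreated from
    (seed, i) each time it is needed. `tokens` must be non-empty.
    """
    signature = []
    for i in range(num_hashes):
        # The salt identifies the i-th hash function; nothing else is stored.
        salt = f"{seed}:{i}|".encode()
        smallest = min(
            int.from_bytes(
                hashlib.blake2b(salt + token.encode(), digest_size=8).digest(),
                "big",
            )
            for token in tokens
        )
        signature.append(smallest)
    return signature


sig_a = minhash_signature(["the", "quick", "brown", "fox"])
sig_b = minhash_signature(["fox", "brown", "quick", "the"])
```

Because every hash is recomputed from the seed, two runs with the same seed produce the same signature regardless of token order, at the cost of extra hashing work per call.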
-
Specify arguments and then transform the gb record to JSON
```
zoo in --ncbi "txid64320[Organism:noexp]" \
--client "localhost:27017" \
--db "flavi" \
--collection "zika"
zoo in --j…
-
There is a bug in the serialization / deserialization: `hashobj` is not stored, so it always reverts to sha1 (the `__init__` default argument) when the class is deserialized.
ekzhu updated
7 years ago
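
One possible way to avoid this kind of bug, sketched under assumptions: store the hash function's name in the serialized state and look the function up again on load. `TinyMinHash` below is a hypothetical stand-in class written for illustration, not the library's actual implementation or its fix.

```python
import hashlib
import pickle


class TinyMinHash:
    """Hypothetical stand-in for a MinHash class (illustration only)."""

    def __init__(self, num_perm=4, hash_name="sha1"):
        self.num_perm = num_perm
        self.hash_name = hash_name
        self.hashobj = getattr(hashlib, hash_name)
        self.values = [2**32 - 1] * num_perm

    def update(self, data):
        # Keep the minimum 32-bit hash value seen for each "permutation".
        for i in range(self.num_perm):
            digest = self.hashobj(bytes([i]) + data).digest()
            self.values[i] = min(self.values[i], int.from_bytes(digest[:4], "big"))

    def __getstate__(self):
        # Persist the hash function's name along with the values.
        return {
            "num_perm": self.num_perm,
            "hash_name": self.hash_name,
            "values": list(self.values),
        }

    def __setstate__(self, state):
        self.num_perm = state["num_perm"]
        self.hash_name = state["hash_name"]
        # Rebuild the hash function from its name instead of falling
        # back to the __init__ default.
        self.hashobj = getattr(hashlib, self.hash_name)
        self.values = list(state["values"])


original = TinyMinHash(hash_name="sha256")
original.update(b"hello")
restored = pickle.loads(pickle.dumps(original))

# Updating both objects afterwards checks behavior, not just stored values.
original.update(b"world")
restored.update(b"world")
```

If the restored object had silently reverted to sha1, the two objects would diverge after the second `update`, which is why a round-trip test should compare behavior after deserialization as well as the stored fields.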
-
Hi,
Very nice package!
Would it be possible to re-use the LSH/minhash functionalities for a different use case? Specifically, is there a way to use it if I have a series of binary vectors and I…
-
Hello, I'm really liking this library, but have recently started running into RAM limitations trying to handle billions of MinhashLSH index members on a 16 GB quad-core OS X machine.
Right now, I'm essentiall…