-
Index sorting is what makes LogsDB so storage-efficient. The better the sort fields are configured, the more storage can be saved by efficiently encoding metadata fields that are the same for a set…
-
Hi,
This is a very good library and implementation of LSH. Thanks for the contribution.
Sorry, I am new to this library. May I ask whether there is a way to return the hash code from:
lsh = LSHash(6, …
-
Hi!
Our reference database (protozoa, 3.9 GB) was successfully converted from FASTA to Seq (tabular) format:
1 protozoa.genomic.fasta CTGACTAAGCATCCCTCTTAAAAGTCGAGGCTAACCCTAACCCTAACC…
-
Transferred from http://code.opencv.org/issues/3773
```
|| Liu Xiao on 2014-06-27 13:16
|| Priority: Normal
|| Affected: 2.4.9 (latest release)
|| Category: flann
|| Tracker: Bug
|| Difficulty:
|| P…
-
Hey,
thank you in advance for your great work and sharing the data :)
I read the README and the Hugging Face details, but it was unclear to me whether fuzzy deduplication is actually done on this dataset.
I underst…
-
### Description
When attempting to train a Reformer with LSH Attention with `n_hashes > 1` on a TPU, training gets stuck and the trainer cannot complete even a single training step.
#…
-
The SDS Technical Reference manuals say that if the shift count is > 48, it is set to 48. The LSH and RSH implementations in `sds_cpu.c` do that for logical shifts, arithmetic shifts, and normalize. For …
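
The clamping rule quoted from the manuals can be sketched roughly as follows. This is an illustrative Python model, not the code in `sds_cpu.c`: the function names are hypothetical, and the 48-bit operand width is an assumption that the excerpt does not state.

```python
MAX_SHIFT = 48            # limit stated in the excerpt: counts above 48 become 48
WIDTH = 48                # ASSUMPTION: operand width, not given in the excerpt
MASK = (1 << WIDTH) - 1   # keeps results inside the assumed width


def clamp_shift_count(count: int) -> int:
    """Return the effective shift count; anything above 48 is treated as 48."""
    return min(count, MAX_SHIFT)


def logical_left_shift(value: int, count: int) -> int:
    """Hypothetical logical left shift with the clamped count."""
    n = clamp_shift_count(count)
    return (value << n) & MASK


def logical_right_shift(value: int, count: int) -> int:
    """Hypothetical logical right shift with the clamped count."""
    n = clamp_shift_count(count)
    return (value & MASK) >> n
```

Under these assumptions, any count of 48 or more shifts every bit out of the operand, so clamping changes nothing about the result of a logical shift; it only bounds the work done.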
-
hm2.cpp:1469:8: warning: address of local variable ‘c’ returned [-Wreturn-local-addr]
float c[3];
^
In file included from /usr/local/include/flann/util/matrix.h:35:0,
fr…
-
Hi, I have a question about a large-scale LSH index. If I have billions of documents, I suppose even 1 TB of RAM is not enough for in-memory LSH. Is there a recommended way to use datasketch for this sce…
-
https://ansvver.github.io/lsh_simhash.html