-
**Describe the feature and the current behavior/state.**
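Currently, sts takes two sentences as input. When comparing a large number of sentences this is too inefficient; batching is possible, but it is still not efficient enough.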
**Will this change the current api? How?**
An additional output could be added to sts.
**Who will benefit with this …
-
The HTTPS and IPv6 simhash test both use one IPv4 address and one IPv6 address. Some websites have more than one IPv4 and/or IPv6 address. We should make it possible to test all IP addresses (similar …
baknu updated
2 months ago
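As a rough illustration of the request above (testing every address rather than one per family), the sketch below lists all IPv4 and IPv6 addresses the system resolver returns for a host. The function name `resolve_all_addresses` and the default port are assumptions made for this example; this is not the project's actual code.

```python
import socket


def resolve_all_addresses(hostname, port=443):
    """Return every IPv4 and IPv6 address the resolver reports for hostname.

    Hypothetical helper for illustration only.
    """
    addresses = {"ipv4": [], "ipv6": []}
    labels = {socket.AF_INET: "ipv4", socket.AF_INET6: "ipv6"}
    # AF_UNSPEC requests both families; SOCK_STREAM avoids one entry per socket type.
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        hostname, port, family=socket.AF_UNSPEC, type=socket.SOCK_STREAM
    ):
        label = labels.get(family)
        ip = sockaddr[0]
        if label is not None and ip not in addresses[label]:
            addresses[label].append(ip)
    return addresses
```

A test could then loop over `addresses["ipv4"]` and `addresses["ipv6"]` and run the same check against each entry instead of against a single address per family.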
-
Can you update the package on PyPI to fix #5?
Thanks :)
pip install python-hashes
py -3 -c "from hashes.simhash import simhash"
ImportError: No module named 'hashtype'
-
Hi there,
When I use minhash with lsh or simhash, it's hard to remove short text. Could anybody suggest a useful method to solve this problem? Thanks a ton!
take below example, and dive…
-
Although the code and paper suggest that 64-bit hashes are being used, the Java Object.hashCode() function only returns 32 bits. The good news is that the bug in #19 has no effect since the upper 16-b…
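To illustrate in general terms why a 32-bit token hash matters for a nominally 64-bit fingerprint, here is a minimal Python sketch. It is not the project's Java code: `hash64`, `hash32`, and `simhash` are hypothetical helpers, and BLAKE2b merely stands in for whatever hash the real implementation uses. When every token hash fits in 32 bits, the upper 32 bit positions only ever receive negative votes, so those fingerprint bits stay zero.

```python
import hashlib


def hash64(token):
    """64-bit token hash (BLAKE2b truncated to 8 bytes; an assumption for this sketch)."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def hash32(token):
    """Simulates a hash function that only yields 32 bits."""
    return hash64(token) & 0xFFFFFFFF


def simhash(tokens, hash_fn, bits=64):
    """Unweighted simhash: each token votes +1 or -1 on every bit position."""
    votes = [0] * bits
    for token in tokens:
        h = hash_fn(token)
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


tokens = "the quick brown fox jumps over the lazy dog".split()
narrow = simhash(tokens, hash32)
wide = simhash(tokens, hash64)
```

Here `narrow >> 32` is always `0`, so only the lower half of the fingerprint carries information, whereas `wide` can use all 64 positions.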
-
I customized it like below:
`
ans = PriorityQueue()
for key in self.get_keys(simhash):
    dups = self.bucket[key]
    self.log.debug('key:%s', key)
…
-
Any ideas here?
TypeError: logger.setLevel is not a function
at Object.<anonymous> (D:\source\simhash\node_modules\natural\lib\natural\brill_pos_tagger\lib\Brill_POS_Tagger.js:26:8)
at Module._comp…
-
Maybe the db code should be updated?
Or should I use an older version of simhash-py?
-
Does it not support Chinese?
-
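Re-plan a new API to make it more convenient for everyone to use. Here are some ideas:
1. Split Cppjieba's word segmentation, keyword extraction, and Simhash methods into small modules that do not depend on each other. Cppjieba 5.0 added a Textrank module, and integrating it into the existing interface would probably be inconvenient to use.
In the original Cppjieba code, the keyword extraction and Simhash steps include the segmentation step, whereas these two steps could actually…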
qinwf updated
7 years ago
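The module separation proposed above could look roughly like the following Python sketch, in which keyword extraction and Simhash accept already-segmented input and therefore do not depend on the segmenter. All names here (`segment`, `extract_keywords`, `simhash`) are hypothetical; the whitespace splitter and frequency-based keywords are simple stand-ins, not Cppjieba's algorithms or the package's real API.

```python
import hashlib
from collections import Counter


def segment(text):
    """Stand-in segmenter (whitespace split); a real one would wrap Cppjieba."""
    return text.split()


def extract_keywords(tokens, top_n=5):
    """Takes tokens rather than raw text, so it never calls segment()."""
    return Counter(tokens).most_common(top_n)


def simhash(weighted_tokens, bits=64):
    """Takes (token, weight) pairs, independent of how they were produced."""
    votes = [0] * bits
    for token, weight in weighted_tokens:
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for i in range(bits):
            votes[i] += weight if (h >> i) & 1 else -weight
    return sum(1 << i for i, vote in enumerate(votes) if vote > 0)


# The caller composes the steps explicitly and can swap any one of them.
tokens = segment("simhash detects near duplicate text and duplicate pages")
keywords = extract_keywords(tokens, top_n=3)
fingerprint = simhash(keywords)
```

With this shape, segmentation runs once and its result can be reused by several downstream steps, which is the convenience the proposal seems to be after.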