Open luyuncheng opened 3 years ago
Merging #284 (c2abcb2) into master (8c802d9) will increase coverage by
1.62%
. The diff coverage is85.07%
.
@@ Coverage Diff @@
## master #284 +/- ##
============================================
+ Coverage 79.27% 80.89% +1.62%
- Complexity 359 413 +54
============================================
Files 58 62 +4
Lines 1404 1581 +177
Branches 126 147 +21
============================================
+ Hits 1113 1279 +166
- Misses 243 249 +6
- Partials 48 53 +5
Impacted Files | Coverage Δ | Complexity Δ | |
---|---|---|---|
...istroforelasticsearch/knn/index/v206/KNNIndex.java | 74.71% <75.00%> (-2.04%) |
18.00 <7.00> (+10.00) |
:arrow_down: |
...index/codec/KNN80Codec/KNN80DocValuesConsumer.java | 75.00% <83.33%> (+4.11%) |
18.00 <0.00> (+3.00) |
|
...relasticsearch/knn/index/KNNVectorFieldMapper.java | 79.22% <86.66%> (+0.27%) |
17.00 <0.00> (+1.00) |
|
...nn/index/codec/KNN80Codec/KNN80CompoundFormat.java | 96.15% <90.00%> (-3.85%) |
4.00 <1.00> (+1.00) |
:arrow_down: |
...forelasticsearch/knn/index/codec/KNNCodecUtil.java | 87.50% <90.90%> (+4.16%) |
5.00 <3.00> (+3.00) |
|
...opendistroforelasticsearch/knn/index/KNNQuery.java | 81.81% <100.00%> (+9.59%) |
10.00 <2.00> (+3.00) |
|
...pendistroforelasticsearch/knn/index/KNNWeight.java | 97.67% <100.00%> (+0.53%) |
13.00 <3.00> (+3.00) |
|
...endistroforelasticsearch/knn/index/SpaceTypes.java | 100.00% <100.00%> (ø) |
7.00 <2.00> (+2.00) |
|
...earch/knn/plugin/script/KNNWhitelistExtension.java | 100.00% <0.00%> (ø) |
3.00% <0.00%> (?%) |
|
... and 7 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 8c802d9...62455ed. Read the comment docs.
Hi @luyuncheng,
Thanks for the PR! I will take a look some time early next week. My concern about the "non-optimized" indices is that we will end up storing two different files for the HNSW structure in the Lucene index. I need to think about the implications of doing this some more.
Hi @luyuncheng,
Thanks for the PR! I will take a look some time early next week. My concern about the "non-optimized" indices is that we will end up storing two different files for the HNSW structure in the Lucene index. I need to think about the implications of doing this some more.
As i checked the nmslib's code, When we add "non-optimized" data storage, K-NN plugin can support all nmslib's Space Types
@luyuncheng Apologies, have not gotten a chance to look at this yet. Will review in January.
Previously, this PR using string to parse the bit vector, this may occupy too much memory!
As i checked the code: space_bit_hamming.h in nmslib, i think use a int32 array to represent a bit vector is better. so, add modified some code to use vector
This Pr is put on hold. Please refer https://github.com/opendistro-for-elasticsearch/k-NN/issues/283#issuecomment-755598088
*Issue #283,
Description of changes:
As I see #264 that add Hamming distance in custom scoring it is a great functionality. i see there is bit_hamming space space_bit_hamming in nmslib. i think we can add this into plugin.
i refer to the code space_bit_hamming and space_bit_hamming_test, may be we could add "SpaceBitVector" into plugin and support bit_hamming space which is no optimized index.
i also refer to PR: #161 which add no optimized index for "negdotprod", i see the nmslib's python_binding code python_binding_nmslib, may be we could add a "save_data" into plugin and can store index and dataset for "no optimized index".
so, i submit this PR.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.