nmslib / hnswlib

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
Apache License 2.0

Normalization should not be done in the wrapper #453

Open DavidGOrtega opened 1 year ago

DavidGOrtega commented 1 year ago

Normalization should be done on the C++ side, if at all (my vectors already come out of my models normalised). Otherwise, every time the C++ code is used from another programming language (e.g. Node.js), normalization has to be reimplemented.

https://github.com/nmslib/hnswlib/blob/359b2ba87358224963986f709e593d799064ace6/python_bindings/bindings.cpp#L273

The proposed solution is to move this function into the library itself:

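// Note: `dim` is not a parameter here; in the Python bindings it is a member
// of the wrapper class that holds the index dimensionality.
void normalize_vector(float* data, float* norm_array) {
    // Accumulate the squared L2 norm.
    float norm = 0.0f;
    for (int i = 0; i < dim; i++)
        norm += data[i] * data[i];
    // Invert the norm; the small epsilon guards against division by zero.
    norm = 1.0f / (sqrtf(norm) + 1e-30f);
    // Scale every component into the output array.
    for (int i = 0; i < dim; i++)
        norm_array[i] = data[i] * norm;
}
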
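
For illustration only, a library-side version might look like the sketch below. It is a hypothetical free function, not part of hnswlib's current API: the name is reused from the snippet above, and the explicit `dim` parameter, the `const` input pointer, and the `inline` qualifier (so it could live in a header-only library) are all assumptions. The main point is that once the function leaves the wrapper class, the dimensionality has to be passed in rather than read from a member.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper (an assumption, not an existing hnswlib function):
// writes the L2-normalized copy of `data` into `out`.
// `dim` is passed explicitly because a library-level function cannot rely
// on the Python wrapper's `dim` member.
inline void normalize_vector(const float* data, float* out, std::size_t dim) {
    // Accumulate the squared L2 norm.
    float sum_sq = 0.0f;
    for (std::size_t i = 0; i < dim; ++i)
        sum_sq += data[i] * data[i];

    // Same epsilon as the wrapper code: an all-zero vector stays all zeros
    // instead of producing a division by zero.
    const float inv_norm = 1.0f / (std::sqrt(sum_sq) + 1e-30f);

    for (std::size_t i = 0; i < dim; ++i)
        out[i] = data[i] * inv_norm;
}
```

Any binding (Python, Node.js, or others) could then call this one function with its own buffer and dimension, or skip it entirely when the input vectors are already normalized.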
yurymalkov commented 1 year ago

Hi @DavidGOrtega, sure, we can move it. PRs are welcome. Thanks!