Closed kmkolasinski closed 3 months ago
Now, I think the problem is somewhere else, inside simsimd package
import simsimd
vf16 = np.array([5.0, 5.0, 5.0], dtype=np.float16)
vi8 = np.array([5, 5, 5], dtype=np.int8)
np.array(simsimd.dot(vf16, vf16)), np.array(simsimd.dot(vi8, vi8))
>> (array(75.), array(1.78813934e-07))
another place which is probably causing confusion
#if SIMSIMD_TARGET_NEON
if (viable & simsimd_cap_neon_k)
switch (kind) {
case simsimd_metric_dot_k: *m = (m_t)&simsimd_cos_i8_neon, *c = simsimd_cap_neon_k; return;
case simsimd_metric_cos_k: *m = (m_t)&simsimd_cos_i8_neon, *c = simsimd_cap_neon_k; return;
case simsimd_metric_l2sq_k: *m = (m_t)&simsimd_l2sq_i8_neon, *c = simsimd_cap_neon_k; return;
default: break;
}
#endif
#if SIMSIMD_TARGET_ICE
if (viable & simsimd_cap_ice_k)
switch (kind) {
case simsimd_metric_dot_k: *m = (m_t)&simsimd_cos_i8_ice, *c = simsimd_cap_ice_k; return;
case simsimd_metric_cos_k: *m = (m_t)&simsimd_cos_i8_ice, *c = simsimd_cap_ice_k; return;
case simsimd_metric_l2sq_k: *m = (m_t)&simsimd_l2sq_i8_ice, *c = simsimd_cap_ice_k; return;
default: break;
}
#endif
#if SIMSIMD_TARGET_HASWELL
if (viable & simsimd_cap_haswell_k)
switch (kind) {
case simsimd_metric_dot_k: *m = (m_t)&simsimd_cos_i8_haswell, *c = simsimd_cap_haswell_k; return;
case simsimd_metric_cos_k: *m = (m_t)&simsimd_cos_i8_haswell, *c = simsimd_cap_haswell_k; return;
case simsimd_metric_l2sq_k: *m = (m_t)&simsimd_l2sq_i8_haswell, *c = simsimd_cap_haswell_k; return;
default: break;
}
#endif
if (viable & simsimd_cap_serial_k)
switch (kind) {
case simsimd_metric_dot_k: *m = (m_t)&simsimd_cos_i8_serial, *c = simsimd_cap_serial_k; return;
case simsimd_metric_cos_k: *m = (m_t)&simsimd_cos_i8_serial, *c = simsimd_cap_serial_k; return;
case simsimd_metric_l2sq_k: *m = (m_t)&simsimd_l2sq_i8_serial, *c = simsimd_cap_serial_k; return;
default: break;
}
break;
cos
is used for dot
?
Yes, that was historically the case, but that should probably be changed. I was planning to look into it next week. It may require new kernels, but those should be easy to design by stripping the cosine implementations.
dot - is not a strict distance metric in math terms, hence probably the confusion in various places in the code
For int8, I see index.hardware_acceleration == "ice"
and there is still something which I don't fully understand. I tested my changes locally by installing usearch from source code, by running
pip install .
which definitely compiles the usearch code. Maybe the code is already fixed in the main branch as compared to usearch==2.12.0 ?
Nope to my question above. When running on main branch I get this then switch to this branch and reinstall usearch
note also speed change for both runs, is it metric_ip_gt
is used only for graph construction ?
@ashvardanian do you know how it is possible ? I still don't undestand how the kernels are being picked, but it looks like simsimd should be used here, so my changes should not affect the search.
SimSIMD should be used in most Linux builds. But you are right to note, that there is a lot of stuff on the main-dev that hasn't been merged yet 🤗
hi, this probably can be closed, right ?
It will be automatically closed, as we merge the branch 🤗
ok, thanks !
Hi @ashvardanian I assume I can close this PR, correct ?
This PR is related to this thread https://github.com/unum-cloud/usearch/issues/405
It proposed a simple fix for Inner Product distance definition