Hi @jrcavani! That is definitely a bug.
The warning isn't telling us much, especially as it redefines the macro with the same value. Can you also report the lscpu output, to make sure your CPU supports all the right features?
The macro that got overwritten was originally specified to be SIMSIMD_DYNAMIC_DISPATCH=1:
if cfg!(feature = "simsimd") {
    build
        .define("USEARCH_USE_SIMSIMD", "1")
        .define("SIMSIMD_DYNAMIC_DISPATCH", "1")
        .define("SIMSIMD_NATIVE_F16", "0");
} else {
But it was redefined to be 0:
warning: usearch@2.12.0: 52 | #define SIMSIMD_DYNAMIC_DISPATCH 0
Again, I haven't looked closely enough to know which one was an env var and which one was a compile-time constant, or how they interact between build.rs and the C++ code.
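For what it's worth, here is a minimal sketch of how the defines quoted above reach the C++ side, assuming the build object is a `cc::Build` and the compiler is GCC or Clang. In that setup each `.define(name, value)` becomes a `-Dname=value` command-line flag, so these are compile-time macros rather than env vars, and a header that later does an unguarded `#define` of the same name would trigger the redefinition warning and override the command-line value. The helper names below are hypothetical and only illustrate the mapping. `cargo:warning=` is the standard way for a build script to surface a message in Cargo's output, which may be easier than writing a log file to disk.

```rust
/// Defines that the `simsimd` branch of build.rs passes to the C++ compiler,
/// copied from the snippet quoted in this thread.
fn simsimd_defines() -> Vec<(&'static str, &'static str)> {
    vec![
        ("USEARCH_USE_SIMSIMD", "1"),
        ("SIMSIMD_DYNAMIC_DISPATCH", "1"),
        ("SIMSIMD_NATIVE_F16", "0"),
    ]
}

/// Render the defines the way GCC/Clang receive them: `-DNAME=VALUE`.
/// (Hypothetical helper, not part of usearch or the cc crate.)
fn as_compiler_flags(defines: &[(&str, &str)]) -> Vec<String> {
    defines
        .iter()
        .map(|(name, value)| format!("-D{name}={value}"))
        .collect()
}

/// Lines a build script could print so that Cargo shows the flags as warnings.
/// (Hypothetical debugging aid.)
fn as_cargo_warnings(defines: &[(&str, &str)]) -> Vec<String> {
    as_compiler_flags(defines)
        .into_iter()
        .map(|flag| format!("cargo:warning=passing {flag}"))
        .collect()
}
```

Printing each string returned by `as_cargo_warnings(&simsimd_defines())` with `println!` from build.rs should make the flags visible in the build output, so they can be compared against the value the header ends up with.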
Here is the lscpu output:
--> lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
CPU family: 6
Model: 106
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 1
Stepping: 6
BogoMIPS: 5799.96
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd ida arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq rdpid md_clear flush_l1d arch_capabilities
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 1.5 MiB (32 instances)
L1i: 1 MiB (32 instances)
L2: 40 MiB (32 instances)
L3: 54 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-63
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Mitigation; Clear CPU buffers; SMT Host state unknown
Retbleed: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Srbds: Not affected
Tsx async abort: Not affected
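To read that output more quickly, here is a rough sketch that pulls the `Flags:` line out of `lscpu` output and checks it against a per-tier list of required flags. The tier names mirror the ones simsimd reports, but the flag lists per tier are my assumption for illustration and are not copied from the SimSIMD sources; the function names are hypothetical as well.

```rust
use std::collections::HashSet;

/// Extract the flag names from the `Flags:` line of `lscpu` output.
fn parse_flags(lscpu_output: &str) -> HashSet<&str> {
    lscpu_output
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("Flags:"))
        .flat_map(|rest| rest.split_whitespace())
        .collect()
}

/// ASSUMED flag requirements per tier, using Linux flag spellings.
/// These lists are illustrative, not taken from SimSIMD itself.
fn tier_requirements() -> Vec<(&'static str, Vec<&'static str>)> {
    vec![
        ("haswell", vec!["avx2", "f16c", "fma"]),
        ("skylake", vec!["avx512f", "avx512bw", "avx512vl"]),
        (
            "ice",
            vec![
                "avx512_vnni",
                "avx512ifma",
                "avx512_bitalg",
                "avx512_vbmi2",
                "avx512_vpopcntdq",
            ],
        ),
        ("sapphire", vec!["avx512_fp16"]),
    ]
}

/// Names of the tiers whose assumed requirements are all present in `flags`.
fn supported_tiers(flags: &HashSet<&str>) -> Vec<&'static str> {
    tier_requirements()
        .into_iter()
        .filter(|(_, needed)| needed.iter().all(|flag| flags.contains(flag)))
        .map(|(tier, _)| tier)
        .collect()
}
```

Under these assumed lists, the flags posted above would cover the haswell, skylake and ice tiers but not sapphire, since `avx512_fp16` is absent.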
Hey @jrcavani, is this still the case?
Yes. Just upgraded both simsimd and usearch:
usearch = { version = "2.13.2", features = ["simsimd", "fp16lib"] }
simsimd = "5.0.0"
[2024-08-13T04:55:48Z INFO test_usearch] Hardware acceleration: serial
simsimd prints expected flags:
use simsimd::capabilities;
use simsimd::ComplexProducts;
use simsimd::SpatialSimilarity;

fn main() {
    let vector_a: Vec<f32> = vec![1.0, 2.0, 3.0, 4.0];
    let vector_b: Vec<f32> = vec![5.0, 6.0, 7.0, 8.0];

    // Compute the inner product between vector_a and vector_b
    let inner_product =
        SpatialSimilarity::dot(&vector_a, &vector_b).expect("Vectors must be of the same length");
    println!("Inner Product: {}", inner_product);

    // Compute the complex inner products, treating vector_a and vector_b
    // as interleaved (real, imaginary) pairs
    let complex_inner_product =
        ComplexProducts::dot(&vector_a, &vector_b).expect("Vectors must be of the same length");
    let complex_conjugate_inner_product =
        ComplexProducts::vdot(&vector_a, &vector_b).expect("Vectors must be of the same length");
    println!("Complex Inner Product: {:?}", complex_inner_product); // -18, 68
    println!(
        "Complex C. Inner Product: {:?}",
        complex_conjugate_inner_product
    ); // 70, -8

    // Report which SIMD backends simsimd detects at runtime
    println!("uses neon: {}", capabilities::uses_neon());
    println!("uses sve: {}", capabilities::uses_sve());
    println!("uses haswell: {}", capabilities::uses_haswell());
    println!("uses skylake: {}", capabilities::uses_skylake());
    println!("uses ice: {}", capabilities::uses_ice());
    println!("uses sapphire: {}", capabilities::uses_sapphire());
}
Inner Product: 70
Complex Inner Product: (-18.0, 68.0)
Complex C. Inner Product: (70.0, -8.0)
uses neon: false
uses sve: false
uses haswell: true
uses skylake: true
uses ice: true
uses sapphire: false
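As a sanity check on the complex results, here is a plain scalar sketch, assuming the slices are read as interleaved (real, imaginary) pairs, which is what the printed values suggest. The functions are hypothetical reference implementations, not simsimd's code: with a = [1+2i, 3+4i] and b = [5+6i, 7+8i], the dot product is (-7+16i) + (-11+52i) = -18+68i, and the conjugate version is (17-4i) + (53-4i) = 70-8i.

```rust
/// Scalar reference: sum of a[k] * b[k] over interleaved (re, im) pairs.
/// Returns None if the lengths differ or are odd.
fn complex_dot(a: &[f32], b: &[f32]) -> Option<(f32, f32)> {
    if a.len() != b.len() || a.len() % 2 != 0 {
        return None;
    }
    let (mut re, mut im) = (0.0f32, 0.0f32);
    for (x, y) in a.chunks_exact(2).zip(b.chunks_exact(2)) {
        // (x0 + i*x1) * (y0 + i*y1)
        re += x[0] * y[0] - x[1] * y[1];
        im += x[0] * y[1] + x[1] * y[0];
    }
    Some((re, im))
}

/// Scalar reference: sum of conj(a[k]) * b[k] over interleaved (re, im) pairs.
/// Returns None if the lengths differ or are odd.
fn complex_vdot(a: &[f32], b: &[f32]) -> Option<(f32, f32)> {
    if a.len() != b.len() || a.len() % 2 != 0 {
        return None;
    }
    let (mut re, mut im) = (0.0f32, 0.0f32);
    for (x, y) in a.chunks_exact(2).zip(b.chunks_exact(2)) {
        // (x0 - i*x1) * (y0 + i*y1)
        re += x[0] * y[0] + x[1] * y[1];
        im += x[0] * y[1] - x[1] * y[0];
    }
    Some((re, im))
}
```

Both agree with the values simsimd printed, so the numeric kernels look fine and only the reported acceleration name in usearch is in question.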
So weird, the Python version prints everything correctly, so it shouldn't be coming from the core implementation:
python -c 'from usearch.index import Index; print(Index(ndim=768, metric="cos", dtype="f16").hardware_acceleration)'
I am playing around with build.rs, but don't see where the issue is coming from.
It did work!!
[2024-08-18T03:24:04Z INFO test_usearch] Hardware acceleration: skylake
Describe the bug
I am getting serial as the acceleration. If I install it in Python through pip, it's good:
I've been able to find the relevant build code paths:
This is the line that prints serial: https://github.com/unum-cloud/usearch/blob/5ea48c87c56a25ab57634a8f207f80ae675ed58e/include/usearch/index_plugins.hpp#L1492
This is the line that decides to include an env var USEARCH_USE_SIMSIMD in build.rs when the simsimd feature is turned on: https://github.com/unum-cloud/usearch/blob/5ea48c87c56a25ab57634a8f207f80ae675ed58e/build.rs#L28
I was able to ensure the build script runs that code block (by writing some log file to disk in build.rs), but I am not seeing any change to the call index.hardware_acceleration().
I tried to build usearch manually:
Does this mean something?
Steps to reproduce
This is the feature list in Cargo.toml:
This is the source code for the test:
I am getting serial as the acceleration. Is this right? This applies to f32, f16 and i8.
Expected behavior
SIMD acceleration is expected.
USearch version
2.12.0
Operating System
Ubuntu 22.04
Hardware architecture
x86
Which interface are you using?
Other bindings
Contact Details
No response