breezewish opened 1 week ago
Have you tried the 7g instances? I generally avoid implementing f32 kernels in SimSIMD, but they should be very easy to add for parity, in case you want to contribute.
@ashvardanian Thank you! I will give 7g a try. Just wondering: which kernels are currently the most optimized?
I'd recommend trying f16 and i8. The f32 should be very easy to add.
@ashvardanian Thank you for the recommendation. I revisited SimSIMD and found that f32 NEON and f32 SVE implementations are already available for l2sq and cosine distance, and the implementations also look good to me. What further work could be done to improve them? I could give it a try :)
Interesting. Not sure if inlining or something else can explain the duration difference in this case.
Describe the bug
Not sure whether this counts as a bug. I'm using usearch to insert 60,000 vectors × 784 dimensions from the Fashion-MNIST dataset.
Here are some interesting findings about index build time:
Compiler flags:
-march=native
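To make the comparison concrete, the two configurations can be built along these lines. This is an illustrative sketch only: the source file name (bench.cpp) and output names are placeholders, not taken from the linked benchmark repository.

```shell
# Illustrative only: compile the benchmark twice, toggling the SimSIMD backend.
# USEARCH_USE_SIMSIMD is the macro checked inside USearch.h.
g++ -std=c++17 -O3 -march=native -DUSEARCH_USE_SIMSIMD=1 bench.cpp -o bench_simsimd
g++ -std=c++17 -O3 -march=native -DUSEARCH_USE_SIMSIMD=0 bench.cpp -o bench_serial
```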
Steps to reproduce
USEARCH_USE_SIMSIMD can be toggled in USearch.h.
The index is built as follows. For more details, please refer to https://github.com/breezewish/usearch-bench:
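For context, a minimal sketch of how such an index is typically constructed through usearch's C++ interface is below. The metric kind, key values, and placeholder data are assumptions for illustration; the exact setup lives in the linked benchmark repository.

```cpp
#include <vector>
#include <usearch/index_dense.hpp>

using namespace unum::usearch;

int main() {
    // Cosine metric over 784-dimensional f32 vectors,
    // matching the Fashion-MNIST shape described above.
    metric_punned_t metric(784, metric_kind_t::cos_k, scalar_kind_t::f32_k);
    index_dense_t index = index_dense_t::make(metric);
    index.reserve(60000); // pre-allocate for the full dataset

    // Placeholder vector; the benchmark inserts real Fashion-MNIST rows.
    std::vector<float> vec(784, 0.5f);
    index.add(0 /* key */, vec.data());
    return 0;
}
```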
Expected behavior
SimSIMD should always be faster? Not sure whether we could draw some insight from the compiled result on the M1 Pro.
USearch version
2.12.0
Operating System
macOS
Hardware architecture
Arm
Which interface are you using?
C++ implementation
Contact Details
No response
Are you open to being tagged as a contributor?
I am open to being mentioned in the project .git history as a contributor

Is there an existing issue for this?

Code of Conduct