ashvardanian / SimSIMD

Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & SVE2 📐
https://ashvardanian.com/posts/simsimd-faster-scipy/
Apache License 2.0

Python test_intersect fails on an AWS c7g instance (Graviton3) #168

Closed by MarkReedZ 1 month ago

MarkReedZ commented 2 months ago

Looks like we have a failure on Graviton3. A t4g.small instance is fine, but on c7g the pytest run fails on main-dev:

================================================================ short test summary info ================================================================
SKIPPED [600] python/test.py:42: SciPy Mahalanobis distance returned NaN due to `sqrt` of a negative number
SKIPPED [300] python/test.py:388: Problems inferring the tolerance bounds for numerical errors
FAILED python/test.py::test_intersect[10-100-uint32-2-100] - AssertionError: Missing [  6  26  40  41  51  54  73  76  98 119 153 167 189] from [  3   6   7   3   7  19  21  23  25  26  25  30  31  34  35  36 ...
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=============================================== 1 failed, 3918 passed, 900 skipped, 193 warnings in 4.84s ===============================================

The setup script:

sudo apt-get update && sudo apt-get install cmake build-essential libjemalloc-dev g++-13 gcc-13 -y
git clone https://github.com/ashvardanian/SimSIMD.git
cd SimSIMD/
git checkout main-dev
sudo apt install python3.12-venv python3-dev -y
python3 -m venv ~/env
source ~/env/bin/activate
pip install pytest pytest-repeat setuptools numpy scipy
python setup.py install --force
pytest python/test.py -s -x -Wd

To skip the failing test and see the rest pass, run: pytest python/test.py -s -x -Wd -k "not test_intersect"
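If typing -k on every run gets tedious, the deselection could be gated on the CPU instead. Below is a minimal sketch, not anything from the repository. The helper names cpu_features and should_skip_intersect are hypothetical, and the idea that SVE support is what separates the failing c7g from the passing t4g.small is an untested assumption on my part.

```python
import platform
from pathlib import Path


def cpu_features(cpuinfo_text):
    """Collect flags from 'Features' (Arm) or 'flags' (x86) lines of cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("features", "flags"):
            features.update(value.split())
    return features


def should_skip_intersect(machine=None, cpuinfo_text=None):
    """Hypothetical policy: skip only on 64-bit Arm CPUs that report SVE.

    Assumption: SVE is the relevant difference between the two instances.
    The thread does not confirm this.
    """
    machine = (machine or platform.machine()).lower()
    if machine not in ("aarch64", "arm64"):
        return False
    if cpuinfo_text is None:
        # Linux-only source of CPU flags; treated as empty elsewhere.
        path = Path("/proc/cpuinfo")
        cpuinfo_text = path.read_text() if path.exists() else ""
    return "sve" in cpu_features(cpuinfo_text)
```

In the test file this could feed pytest.mark.skipif(should_skip_intersect(), reason="...") on test_intersect, so the other tests run unchanged on every machine.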

ashvardanian commented 2 months ago

Yes, that's most likely an algorithmic issue. I am actively working on a much-improved set intersection algorithm (https://github.com/ashvardanian/SimSIMD/commit/aba39868075e3dffca0daecd2ebb0f9528feea1a and subsequent commits) and would postpone resolving this specific issue until we merge the new solution 🤗

For now, we can just mark it as experimental and always redirect from the Arm implementation to the serial code. What do you think, @MarkReedZ?

MarkReedZ commented 2 months ago

Are we okay just leaving this as-is and using -k "not test_intersect", assuming no one else will bump into this until the new code is in?

ashvardanian commented 2 months ago

For now, we can just mark it as experimental and always redirect from the Arm implementation to the serial code.

This way no -k is needed 🤗
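To illustrate the "redirect to serial code" idea, here is a rough Python sketch. It is not SimSIMD's implementation (the library's kernels are written in C), and the names intersect_serial, intersect, accelerated, and experimental_ok are all hypothetical. It assumes both inputs are sorted and duplicate-free and that the result is the number of common elements.

```python
def intersect_serial(a, b):
    """Count common elements of two sorted, duplicate-free sequences.

    Classic two-pointer merge: advance whichever side holds the smaller value.
    """
    i = j = count = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            count += 1
            i += 1
            j += 1
    return count


def intersect(a, b, accelerated=None, experimental_ok=False):
    """Dispatch sketch: use the accelerated kernel only on explicit opt-in.

    `accelerated` stands in for an architecture-specific kernel. While it is
    considered experimental, every call falls through to the serial path.
    """
    if accelerated is not None and experimental_ok:
        return accelerated(a, b)
    return intersect_serial(a, b)
```

With a dispatcher like this, a misbehaving kernel is never reached by default, so the test suite exercises the serial path on every platform.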

ashvardanian commented 1 month ago

@MarkReedZ, I believe this is resolved now, since the new algorithms were merged in v5.3.