unum-cloud / usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
https://unum-cloud.github.io/usearch/
Apache License 2.0
2.24k stars 137 forks source link

Bug: `PersistAndRestore` C# test fails on MacOS #454

Open ashvardanian opened 3 months ago

ashvardanian commented 3 months ago

Describe the bug

Weirdly, all C# tests pass on most platforms, but one of them fails on MacOS, and I don't know where it's coming from.

Steps to reproduce

On macOS with Arm-based chips:

mkdir -p "csharp/lib/runtimes/osx-arm64/native"
cp "build_artifacts/libusearch_c.dylib" "csharp/lib/runtimes/osx-arm64/native"
cd csharp
dotnet test -c Debug --logger "console;verbosity=detailed"

This fails. But excluding that one test fixes things.

dotnet test -c Debug --logger "console;verbosity=detailed" --filter "FullyQualifiedName!=Cloud.Unum.USearch.Tests.UsearchIndexTests.PersistAndRestore"

Expected behavior

Tests passing

USearch version

33c997e3b5f2ef7f8e28942961576d1cf52bc3a5

Operating System

macOS Sonoma

Hardware architecture

Arm

Which interface are you using?

Other bindings

Contact Details

No response

Are you open to being tagged as a contributor?

Is there an existing issue for this?

Code of Conduct

brittlewis12 commented 3 months ago

was able to get a .NET7 env locally.

can easily reproduce on the indicated SHA:

dotnet test -c Debug --logger "console;verbosity=detailed" --filter "FullyQualifiedName=Cloud.Unum.USearch.Tests.UsearchIndexTests.PersistAndRestore" > csharp-debug-macos-arm64.txt 
The active test run was aborted. Reason: Test host process crashed
Test Run Aborted.

it would appear that something about the cleanup in the finally block (and the accesses on the out arrays), causes the crash. here’s the full isolated output w a ton of dubug statements sprinkled throughout the test and the search method.

Full test log output ⬇️ ```c# Determining projects to restore... All projects are up-to-date for restore. Cloud.Unum.USearch -> /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch/bin/Debug/netstandard2.0/Cloud.Unum.USearch.dll Cloud.Unum.USearch.Tests -> /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/Cloud.Unum.USearch.Tests.dll NativeLibDir: /Users/tito/code/usearch/csharp/lib/ Files: /Users/tito/code/usearch/csharp/lib/runtimes/osx-arm64/native/libusearch_c.dylib bin/Debug/net7.0/ Test run for /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/Cloud.Unum.USearch.Tests.dll (.NETCoreApp,Version=v7.0) Microsoft (R) Test Execution Command Line Tool Version 17.7.2 (arm64) Copyright (c) Microsoft Corporation. All rights reserved. Starting test execution, please wait... A total of 1 test files matched the specified pattern. /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/Cloud.Unum.USearch.Tests.dll [xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.5+1caef2f33e (64-bit .NET 7.0.20) [xUnit.net 00:00:00.17] Discovering: Cloud.Unum.USearch.Tests [xUnit.net 00:00:00.19] Discovered: Cloud.Unum.USearch.Tests [xUnit.net 00:00:00.19] Starting: Cloud.Unum.USearch.Tests Building index... Built index Adding vector System.Single[] to index.... Added vector System.Single[] to index Saving index to path /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch.... Saved index to path /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch Asserting file exists at /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch Asserting temp file exists at dir /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder Restoring index from /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch... Asserting restored index size... Index assert succeeded, contains one vector Searching index... Initializing keys and distances arrays... Keys and distances arrays initialized. Pinning handles to query, keys, and distances... Handles to query, keys, and distances pinned. With pointers to pinned handles, calling native search... Native search returned matches: 1 Native search error pointer: 0 Native search keys pointer: System.UInt64[] Native search distances pointer: System.Single[] Matches less than count, resizing keys&distances... 1 < 10 Resized keys and distances to match 1 Search OK Freeing handles... Freed handles. Got matches: 1, keys: System.UInt64[] Cleaning up... ```

is it possible it’s a race related to how the memory for those inout arrays is freed after it’s pinned?

i don’t know much about c# semantics, but when I comment out the handles free calls in the finally block, it passes about 1/5 of the time?! it still mostly fails tho.

Test log success, no frees in finally block ``` Determining projects to restore... All projects are up-to-date for restore. Cloud.Unum.USearch -> /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch/bin/Debug/netstandard2.0/Cloud.Unum.USearch.dll Cloud.Unum.USearch.Tests -> /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/Cloud.Unum.USearch.Tests.dll NativeLibDir: /Users/tito/code/usearch/csharp/lib/ Files: /Users/tito/code/usearch/csharp/lib/runtimes/osx-arm64/native/libusearch_c.dylib bin/Debug/net7.0/ Test run for /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/Cloud.Unum.USearch.Tests.dll (.NETCoreApp,Version=v7.0) Microsoft (R) Test Execution Command Line Tool Version 17.7.2 (arm64) Copyright (c) Microsoft Corporation. All rights reserved. Starting test execution, please wait... A total of 1 test files matched the specified pattern. /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/Cloud.Unum.USearch.Tests.dll [xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.5+1caef2f33e (64-bit .NET 7.0.20) [xUnit.net 00:00:00.18] Discovering: Cloud.Unum.USearch.Tests [xUnit.net 00:00:00.19] Discovered: Cloud.Unum.USearch.Tests [xUnit.net 00:00:00.19] Starting: Cloud.Unum.USearch.Tests Building index... Built index Adding vector System.Single[] to index.... Added vector System.Single[] to index Saving index to path /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch.... Saved index to path /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch Asserting file exists at /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch Asserting temp file exists at dir /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder Restoring index from /Users/tito/code/usearch/csharp/src/Cloud.Unum.USearch.Tests/bin/Debug/net7.0/savedVectorFolder/tmp.usearch... Asserting restored index size... Index assert succeeded, contains one vector Searching index... Initializing keys and distances arrays... Keys and distances arrays initialized. Pinning handles to query, keys, and distances... Handles to query, keys, and distances pinned. With pointers to pinned handles, calling native search... Native search returned matches: 1 Native search error pointer: 0 Native search keys pointer: System.UInt64[] Native search distances pointer: System.Single[] Matches less than count, resizing keys&distances... 1 < 10 Resized keys and distances to match 1 Search OK Got matches: 1, keys: System.UInt64[] Cleaning up... Cleaned up tmp saved index. [xUnit.net 00:00:00.42] Finished: Cloud.Unum.USearch.Tests Passed Cloud.Unum.USearch.Tests.UsearchIndexTests.PersistAndRestore [203 ms] Test Run Successful. Total tests: 1 Passed: 1 Total time: 0.7803 Seconds ```

Happy to spelunk further if this gives you any useful info!

brittlewis12 commented 3 months ago

think I found it:

index initialized as f64, seemingly intentionally by the comment which says "don't quantize": https://github.com/unum-cloud/usearch/blob/75eb7736158e2d8001481f1fb612c2697482ff8b/csharp/src/Cloud.Unum.USearch.Tests/USearchIndexTests.cs#L57

but the vector (as indexed and queried) is a float/f32, rather than a double/f64: https://github.com/unum-cloud/usearch/blob/75eb7736158e2d8001481f1fb612c2697482ff8b/csharp/src/Cloud.Unum.USearch.Tests/USearchIndexTests.cs#L64

this should do it:

diff --git a/csharp/src/Cloud.Unum.USearch.Tests/USearchIndexTests.cs b/csharp/src/Cloud.Unum.USearch.Tests/USearchIndexTests.cs
index f69b9e5..508b119 100644
--- a/csharp/src/Cloud.Unum.USearch.Tests/USearchIndexTests.cs
+++ b/csharp/src/Cloud.Unum.USearch.Tests/USearchIndexTests.cs
@@ -61,7 +61,7 @@ public class UsearchIndexTests
             expansionSearch: 19 // Control the quality of search, optional
         );

-        var vector = new float[] { 0.2f, 0.6f, 0.4f };
+        var vector = new double[] { 0.2f, 0.6f, 0.4f };
         index.Add(42, vector);

I'll put up a PR w this patch and unfiltering the test here in a few.

ashvardanian commented 3 months ago

Interesting. I believe using double is a good idea, but in either case the process shouldn't crash... I'd simply expect the following assertion to fail. So there may be a problem inferring the type of vector, assuming a larger buffer, writing out-of-bounds... and finally crashing. Can it be the case?

brittlewis12 commented 3 months ago

yes something to that effect sounds likely — is it necessary the query vector align w the index precision type? and should add check/enforce the precision isn’t less than the index precision?

quantization is one thing, so more precise must be allowed but an improperly configured index silently allowing less precise data seems like a potential footgun. perhaps it could cast the input vector to the appropriate type given the index’s expected precision as a middle ground?

ashvardanian commented 3 months ago

is it necessary the query vector align w the index precision type?

No, those are independent variables.

should add check/enforce the precision isn’t less than the index precision?

No need for that.

perhaps it could cast the input vector to the appropriate type given the index’s expected precision as a middle ground?

It performs the needed casts under the hood already.


What's weird - this only appears on MacOS... and only with C# bindings/tests.

brittlewis12 commented 3 months ago

Are there tests that exercise a similar code path on other platforms or for other languages? It seems like a mistake, rather than an intentional test case designed to pass a smaller than expected value.

But agreed it ideally shouldn’t crash regardless.

ashvardanian commented 3 months ago

Not exactly the same test, but it should be easy to add in test.cpp using the test_punned_add_remove_vector function on the main-dev branch as a reference.