unum-cloud / usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
https://unum-cloud.github.io/usearch/
Apache License 2.0
1.92k stars 109 forks source link

Bug: Non of methods are working in Multi-threaded envirionment #441

Open etempm opened 4 weeks ago

etempm commented 4 weeks ago

Describe the bug

I tried each of Add, Remove, Save and Search methods independently from each others with multiple threads. All of them are failed. Memory corruption error raised.

Steps to reproduce

You can run any of this methods in multiple threads to see error.

Expected behavior

At least, thread safe Search method is required. I am using locking to overcome this, but it is very slow.

USearch version

2.12.0

Operating System

Windows 11

Hardware architecture

x86

Which interface are you using?

Other bindings

Contact Details

etempm@gmail.com

Are you open to being tagged as a contributor?

Is there an existing issue for this?

Code of Conduct

ashvardanian commented 4 weeks ago

You are probably doing something wrong, but I can't guess without code 🤗

etempm commented 4 weeks ago

Please run this code to see error. Without lock, it throw Access violation exception.

` private async void Test() { USearchIndex UIndex = new USearchIndex( metricKind: MetricKind.Cos, // Choose cosine metric quantization: ScalarKind.Float32, // Only quantization to Float32, Float64 is currently supported dimensions: 512, // Define the number of dimensions in input vectors connectivity: 16, // How frequent should the connections in the graph be, optional expansionAdd: 128, // Control the recall of indexing, optional expansionSearch: 64 // Control the quality of search, optional ); //UIndex.IncreaseCapacity(102400); //-> THIS METHOD IS PRIVATE, SO I CAN NOT ACCESS IT.

var TaskAdd = Task.Run(() =>
{
    Parallel.For(0L, 102400L, i =>
    {
        float[] tmpVector = new float[512];
        for (int j = 0; j < 512; j++)
        {
            tmpVector[j] = Random.Shared.NextSingle();
        }
        //Monitor.Enter(UIndex);  //--> if I enter lock here, no problems.
        UIndex.Add((ulong)i, tmpVector); //--> Throws access vialotion error with out lock.
        //Monitor.Exit(UIndex);
    });
});

var TaskSearch = Task.Run(() =>
{
    float[] SearchVector = Enumerable.Range(0, 512).Select(n => 0.001f * n).ToArray();
    long Result = 0;
    ulong[] Keys;
    float[] Distances;
    float[] tmpVector = new float[512];
    ulong Start = UIndex.Capacity();
    Parallel.For(0L, 102400L, i =>
    {
        //Monitor.Enter(UIndex);   //--> if I enter lock here, no problems.
        Result = UIndex.Search(SearchVector, 5, out Keys, out Distances); //--> Throws access vialotion error with out lock.
        //Monitor.Exit(UIndex);                                                           //cacheLock.ExitWriteLock();
    });
});
await Task.WhenAny(TaskAdd, TaskSearch);

} `

ashvardanian commented 4 weeks ago

@etempm, can you limit the number of threads to a value smaller or equal to the number of physical cores on your machine? I suspect that's the issue.

etempm commented 4 weeks ago

Yes, you are right!. I limited thread count to Environment.ProcessorCount and it worked like a charm. Thank you very much!!!

etempm commented 4 weeks ago

Just for clarification. I will use usearch for real time face recognition over 100 ip cameras. Each camera working on own thread. With the 16 core cpu, 100 threads, real time. I can not limit thread count in thar situation. Will this be same problem? What is you opinion?

ashvardanian commented 4 weeks ago

For that we need to extend the "reserve" interface in C#, to allow optional second argument for the number of thread-specific "contexts" it can use. Then everything would work fine. Can you try patching the library to do that?

etempm commented 4 weeks ago

If i can understand what you mean, i wil be glad to do that. I see extend method in native class which is increase capacity in c#. But i did not understand "thread specific context". C# code just calling your dll. Is your reserve method has that "context" parameter already? Can you tell more details .

etempm commented 4 weeks ago

Ok, i am looking c code now. As i can see, a lot of functionality is missing on c# side. I can add that functions to c# wrapper. If you can add thread specific context parameter also, i will patch it too. Thanks for great support again.

ashvardanian commented 4 weeks ago

So the C++ layer has a "reserve" function, which returns a structure - not just an integer. In that structure you can inform the number of threads that will use the index. For every one of those I pre-allocate queues and other buffers to avoid doing that during concurrent calls. That functionality is what we want to expose to C# users 🤗

Thanks to you too!

etempm commented 3 weeks ago

I think you mean

struct index_limits_t { std::size_t members = 0; std::size_t threads_add = std::thread::hardware_concurrency(); std::size_t threads_search = std::thread::hardware_concurrency();

inline index_limits_t(std::size_t n, std::size_t t) noexcept : members(n), threads_add(t), threads_search(t) {}
inline index_limits_t(std::size_t n = 0) noexcept : index_limits_t(n, std::thread::hardware_concurrency()) {}
inline std::size_t threads() const noexcept { return (std::max)(threads_add, threads_search); }
inline std::size_t concurrency() const noexcept { return (std::min)(threads_add, threads_search); }

};

etempm commented 3 weeks ago

Hi, sory, I am not cpp guy. Trying to call function. I defined struct as:

[StructLayout(LayoutKind.Sequential)] public struct IndexLimitsT { public ulong Members; public ulong ThreadsAdd; public ulong ThreadsSearch; }

I added this function to NativeMethods

[DllImport(LibraryName, CallingConvention = CallingConvention.Cdecl)] public static extern void usearch_reserve(usearch_index_t index, IndexLimitsT limits, out usearch_error_t error);

and called the method like this:

IndexLimitsT limits2 = new IndexLimitsT { Members = 10, ThreadsAdd = 10, ThreadsSearch = 10, }; NativeMethods.usearch_reserve(this._index, limits2, out IntPtr error); HandleError(error);

bu no luck. Any idea?