Open spicymatt opened 4 months ago
Interesting! Do you have a dataset example? I will try to reproduce.
The dataset does not seem to matter. Will try to reproduce it in a sample but here is an example of a blocked code. I have 3 threads waiting for each other in
return __atomic_fetch_or(&slots_[i / bits_per_slot()], mask, __ATOMIC_ACQUIRE) & mask;
of method: inline bool atomic_set(std::size_t i) noexcept {...}
@ashvardanian is this something you can reproduce on your side ? Regards.
Hi @spicymatt! I think I can, but I was working on the C# issues yesterday, and will try to find time to work on Swift today. It would be great if you could open a PR with a minimal test that breaks. It will definitely make things faster for me. Thanks!
Hello I will try but it’s rather hard to break in our tests…it breaks easily in the production app… Thanks a lot Matthieu
On 27 Feb 2024, at 17:31, Ash Vardanian @.***> wrote:
Hi @spicymatt https://github.com/spicymatt! I think I can, but I was working on the C# issues yesterday, and will try to find time to work on Swift today. It would be great if you could open a PR with a minimal test that breaks. It will definitely make things faster for me. Thanks!
— Reply to this email directly, view it on GitHub https://github.com/unum-cloud/usearch/issues/354#issuecomment-1966994667, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIB6RT6XXCKF4RCPNGQQX6TYVYC65AVCNFSM6AAAAABDZH7RGGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRWHE4TINRWG4. You are receiving this because you were mentioned.
Describe the bug
When calling USearchIndex.add() from multiple threads in the Swift concurrency (Swift) a deadlock situation arises in this code:
inline node_lock_t nodelock(std::size_t slot) const noexcept
Threads are stuck in this loop and never return
Steps to reproduce
Use a TaskGroup like in the following example to add the vectors to the array concurrently:
public protocol USearchSignature:Sendable { var us_id:UInt64 {get} var us_vector:[Float] {get} }
var index = USearchIndex ... // used to control how concurrent we are let numberOfConcurentTask = 5
// vectors is an array of USearchSignature public add (vectors:[USearchSignature]) async throws {
var iterator = vectors.makeIterator()
// async code here adds the content in a concurrent manner into the index for _ in 0..<numberOfConcurentTask { if let current = iterator.next() { try Task.checkCancellation() group.addTask{ index.add(key: current.us_id, vector: current.usvector) } } } for try await in group { if let current = iterator.next() { try Task.checkCancellation() group.addTask{ index.add(key: current.us_id, vector: current.us_vector) } } } }
Expected behavior
No deadlock
uSearch is supposed to be thread safe for addition of vectors. It is crucial for us to be able to concurrently add vectors to the index when the system launches since the index building is too slow in single threaded mode. We use a saved index whenever we can...but sometimes we need to reload a 100k vector index (512 floats per vector) efficiently. Multi-threaded loading gives us decent performance. But we cannot afford such deadlocks and we do get them in the current code.
USearch version
v2.9
Operating System
macOS Sonoma
Hardware architecture
Arm
Which interface are you using?
Other bindings
Contact Details
matthieu.kopp@gmail.com
Is there an existing issue for this?
Code of Conduct