389ds / 389-ds-base

The enterprise-class Open Source LDAP server for Linux
https://www.port389.org/

Work queue contention #5338

Open jchapma opened 2 years ago

jchapma commented 2 years ago

Description: Contention has been identified on the connection work queue; this issue aims to resolve it.

Additional context: See the design discussion on this topic here: https://github.com/mreynolds389/389wiki/pull/77

jchapma commented 2 years ago

This is a draft implementation of a lockless work queue to address the contention observed when the server is processing connections. However, the design of the lockless queue means it is unsafe to free a node after it is dequeued, as another thread could still be using it. I have seen this behaviour intermittently: a thread tries to access a node that has already been freed, which breaks the linked list.

Possible solutions to this problem:
- Use a double-word compare-and-swap, but this is not available on commonly used hardware.
- Use node pointer modification counters to only free a node when no thread is using it.
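
To illustrate the modification-counter idea, here is a minimal sketch and not the code from the draft implementation. It makes several assumptions: it uses a stack rather than the work queue, to keep it short; nodes live in a fixed pool and are addressed by a 32-bit index, so the index and a 32-bit counter fit in one 64-bit word and an ordinary single-word CAS is enough; and nodes are never handed back to the allocator, so a stale index always points at valid memory. All names (`pool`, `stack_push`, `stack_pop`, `stale_cas_is_rejected`) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define POOL_SIZE 8
#define NIL UINT32_MAX /* "no node" marker */

/* Hypothetical node type: nodes stay in a static pool and are never freed. */
typedef struct {
    int value;
    _Atomic uint32_t next; /* pool index of the next node, or NIL */
} node_t;

static node_t pool[POOL_SIZE];

/* Head word: low 32 bits = node index, high 32 bits = modification counter. */
static _Atomic uint64_t head;

static uint64_t pack(uint32_t idx, uint32_t count) {
    return ((uint64_t)count << 32) | idx;
}
static uint32_t idx_of(uint64_t h) { return (uint32_t)h; }
static uint32_t count_of(uint64_t h) { return (uint32_t)(h >> 32); }

static void stack_init(void) { atomic_store(&head, pack(NIL, 0)); }

static void stack_push(uint32_t idx) {
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        atomic_store(&pool[idx].next, idx_of(old));
        desired = pack(idx, count_of(old) + 1); /* bump counter on every change */
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
}

static uint32_t stack_pop(void) {
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        if (idx_of(old) == NIL) {
            return NIL; /* empty */
        }
        uint32_t next = atomic_load(&pool[idx_of(old)].next);
        desired = pack(next, count_of(old) + 1);
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
    return idx_of(old);
}

/* Single-threaded replay of an ABA interleaving.
 * Returns true if the stale CAS is rejected thanks to the counter. */
static bool stale_cas_is_rejected(void) {
    stack_init();
    stack_push(0); /* stack: 0 */
    stack_push(1); /* stack: 1 -> 0 */

    /* "Thread 1" begins a pop: reads head (node 1) and its next (node 0), then stalls. */
    uint64_t stale = atomic_load(&head);
    uint32_t stale_next = atomic_load(&pool[idx_of(stale)].next);

    /* "Thread 2" pops both nodes and pushes node 1 back.
     * The head index is 1 again, but node 0 is no longer on the stack. */
    stack_pop();
    stack_pop();
    stack_push(1);

    /* "Thread 1" resumes: the index matches, the counter does not, so the CAS fails. */
    uint64_t desired = pack(stale_next, count_of(stale) + 1);
    return !atomic_compare_exchange_strong(&head, &stale, desired);
}
```

Calling `stale_cas_is_rejected()` returns true: a CAS on the index alone would have succeeded and linked the already-removed node 0 back in. Note that the counter only makes node reuse detectable; returning memory to the allocator would still need a separate reclamation scheme.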

I am currently working on this, hence the WIP label.

jchapma commented 2 years ago

Rebase

tbordaz commented 1 year ago

Once #4812 and this issue are both fixed, the expected gains are large (more than 50% throughput). Our tests will yield tuning recommendations that we can implement (in a separate ticket) and describe in the documentation.

Firstyear commented 1 year ago

Please don't roll your own lockless queue; there are many subtleties that can break things. It would be better to use an existing lock-free queue library that has had extensive testing and analysis, or to wrap the crossbeam Rust lock-free queue with a C FFI API.

PS: DWCAS is common; it is available on every modern CPU except a single generation of very early AMD 64-bit CPUs.