Open dista opened 7 months ago
I'm interested in this, since it looks similar to an issue I ran into before. Does PR #419 (https://github.com/Amanieu/parking_lot/pull/419) fix your problem?
Currently I simply disable parking_lot in tokio. I ran your PR's benchmark on my machine: cargo run --release 32 2 10000 100
std::sync::Mutex avg 18.299529ms min 17.132839ms max 21.931748ms
parking_lot::Mutex avg 16.937637ms min 14.974131ms max 19.934233ms
spin::Mutex avg 30.254703ms min 12.640727ms max 54.775686ms
AmdSpinlock avg 31.261368ms min 14.366787ms max 61.797515ms
But honestly, I do not like the idea of thread::sleep(1ms). I think it may hurt performance in other ways.
thread::sleep(1ms) only happens after spin, cpu_relax, and thread_yield have all failed, which means there is really heavy cache-line contention. At that point the thread should sleep to avoid busy-waiting.
But the choice of 1ms seems to have no specific reason. Why choose 1ms and not 0.5ms?
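For readers following the exchange above, here is a minimal sketch of the escalation being discussed: spin with a CPU relax hint, then yield the thread, and only then sleep. It is not the code from the PR. The names Backoff, Stage, and SpinLock, and the SPIN_LIMIT and YIELD_LIMIT thresholds, are hypothetical and chosen only for illustration. The 1ms sleep mirrors the value under debate.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

/// Which waiting strategy a backoff step used.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Spin,
    Yield,
    Sleep,
}

/// Hypothetical escalating backoff: spin -> yield -> sleep.
pub struct Backoff {
    step: u32,
}

impl Backoff {
    // Assumed thresholds; real implementations tune these per platform.
    const SPIN_LIMIT: u32 = 6;
    const YIELD_LIMIT: u32 = 10;

    pub fn new() -> Self {
        Backoff { step: 0 }
    }

    /// Waits once using the current stage and returns the stage that was used.
    pub fn snooze(&mut self) -> Stage {
        let stage = if self.step < Self::SPIN_LIMIT {
            Stage::Spin
        } else if self.step < Self::YIELD_LIMIT {
            Stage::Yield
        } else {
            Stage::Sleep
        };
        match stage {
            // Busy-wait with the CPU relax hint, doubling the spin count each step.
            Stage::Spin => {
                for _ in 0..(1u32 << self.step) {
                    hint::spin_loop();
                }
            }
            // Give other runnable threads a chance to run.
            Stage::Yield => thread::yield_now(),
            // Last resort under heavy contention: stop burning CPU for 1ms.
            Stage::Sleep => thread::sleep(Duration::from_millis(1)),
        }
        // Once the sleep stage is reached, stay there.
        if self.step < Self::YIELD_LIMIT {
            self.step += 1;
        }
        stage
    }
}

/// Toy test-and-set lock that uses the backoff while contended.
pub struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    pub const fn new() -> Self {
        SpinLock { locked: AtomicBool::new(false) }
    }

    pub fn lock(&self) {
        let mut backoff = Backoff::new();
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            backoff.snooze();
        }
    }

    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}
```

With these assumed thresholds, a waiter sleeps only after six spin rounds and four yields have failed to acquire the lock. Changing the Duration in the Sleep arm is the place to experiment with 0.5ms versus 1ms.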
We have a video/audio streaming application built on tokio (which enables parking_lot by default). With parking_lot enabled, when we use wrk to benchmark the application's HTTP output, the application is bottlenecked on CPU: no matter how many threads we assign to tokio (with 32 threads, for example, it should reach at most 3200%), its CPU usage cannot exceed 600%.
After disabling parking_lot, we reach the numbers we anticipated.
The benchmark results for parking_lot are very poor on our server (AMD EPYC 7502, Rocky Linux 9.3, kernel 5.14.0-362.8.1.el9_3.x86_64): cargo run --release 32 2 10000 100
system information:
CPU
memory: 250G
numactl --show