Closed devIonutApostol closed 4 years ago
I need to understand what problem that the threadpool would solve. How did you determine that the asynctimeout thread gets cpu time once every 3-4 seconds when it should be sleeping for only 300ms? Why does using the threadpool cause a thread to get more cpu time?
I will describe what happens. Usually happens 5-10 times a day , only for one random pod.
Healthchecks on aerospike server side doesn't found anything wrong, we have 3 nodes.
Async Client Configuration
AsyncClientPolicy aerospikePolicy = new AsyncClientPolicy { asyncMaxCommandAction = MaxCommandAction.REJECT, asyncMaxCommands = 1000, maxSocketIdle = 0, writePolicyDefault = { commitLevel = CommitLevel.COMMIT_MASTER, totalTimeout = 300, maxRetries = 1 }, readPolicyDefault = { totalTimeout = 300, maxRetries = 1 } };
As regarding of moving those threads on the threadpool, is more actually a good practice. We just want some kind of option to call those tend and asynctimeout actions.
We modified the AsyncTimeoutQueue Run method to log the frequency of running under heavy load.
The result is something like this
AsyncTimeout run: elapsed milliseconds since last run: 317 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 328 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 2031 sleep interval: -1 AsyncTimeout run: elapsed milliseconds since last run: 321 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 318 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 1880 sleep interval: -1 AsyncTimeout run: elapsed milliseconds since last run: 300 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 301 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 322 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 2887 sleep interval: -1 AsyncTimeout run: elapsed milliseconds since last run: 321 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 330 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 2694 sleep interval: -1 AsyncTimeout run: elapsed milliseconds since last run: 320 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 327 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 2116 sleep interval: -1 AsyncTimeout run: elapsed milliseconds since last run: 324 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 308 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 1872 sleep interval: -1 AsyncTimeout run: elapsed milliseconds since last run: 301 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 331 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 2049 sleep interval: -1 AsyncTimeout run: elapsed milliseconds since last run: 302 sleep interval: 301 AsyncTimeout run: elapsed milliseconds since last run: 323 sleep interval: 301
After hacking something and moving this action on the threadpool, disabled that thread, same load
calling it like this _ = RunAerospikeTimeout();
yncTimeout run: elapsed milliseconds since last run 337 AsyncTimeout run: elapsed milliseconds since last run 325 AsyncTimeout run: elapsed milliseconds since last run 344 AsyncTimeout run: elapsed milliseconds since last run 343 AsyncTimeout run: elapsed milliseconds since last run 317 AsyncTimeout run: elapsed milliseconds since last run 317 AsyncTimeout run: elapsed milliseconds since last run 334 AsyncTimeout run: elapsed milliseconds since last run 304 AsyncTimeout run: elapsed milliseconds since last run 324 AsyncTimeout run: elapsed milliseconds since last run 332 AsyncTimeout run: elapsed milliseconds since last run 323 AsyncTimeout run: elapsed milliseconds since last run 323 AsyncTimeout run: elapsed milliseconds since last run 345 AsyncTimeout run: elapsed milliseconds since last run 357 AsyncTimeout run: elapsed milliseconds since last run 331 AsyncTimeout run: elapsed milliseconds since last run 305 AsyncTimeout run: elapsed milliseconds since last run 314 AsyncTimeout run: elapsed milliseconds since last run 311 AsyncTimeout run: elapsed milliseconds since last run 324 AsyncTimeout run: elapsed milliseconds since last run 347 AsyncTimeout run: elapsed milliseconds since last run 359 AsyncTimeout run: elapsed milliseconds since last run 346 AsyncTimeout run: elapsed milliseconds since last run 347 AsyncTimeout run: elapsed milliseconds since last run 340 AsyncTimeout run: elapsed milliseconds since last run 334 AsyncTimeout run: elapsed milliseconds since last run 312
AsyncTimeoutQueue is performing as designed. If there are no pending timer requests, the cancelToken wait is set to -1 (infinite). When a new timer request is added, the cancelToken is woken up and the next cancelToken wait is set to "timeout + 1". The long sleeps occur because there are no timer requests during that time.
The threadpool alternative does not improve on the original design for AsyncTimeoutQueue or tend. I prefer the control of deadicated threads over the shared thread support in threadpool.
Thanks for clarifications, after further investigations it seems that I was wrong about this.
The sdk creates 2 manual threads (tend and asynctimeout). I would prefer to move that logic to the threadpool, for example in some background tasks because in some cases even if we have a timeout of 300ms, I see that the asynctimeout thread gets CPU time every 3-4 seconds. Using cancellation tokens works ok, but still we want those actions on the threadpool.
Context: asp.net core, .net 5 rc1, high traffic (~12k IO per sec)
Should I make a PR ?