solana-labs / solana

Web-Scale Blockchain for fast, secure, scalable, decentralized apps and marketplaces.
https://solanalabs.com
Apache License 2.0
13.03k stars 4.19k forks

Lower http worker thread priority #14556

Closed ryoqun closed 2 years ago

ryoqun commented 3 years ago

Problem

HTTP worker threads are scheduled with the same priority as other threads. This can stall the validator pretty easily by indirectly choking the machine through CPU saturation.

When I manually reniced the http threads*, I observed a generally more favorable validator sync status, even under heavy load from RPC requests.

*: $ ps -e -T | grep http | awk '{print $2}' | while read pid; do sudo renice -n 20 -p $pid; done

Proposed Solution

Just lower the thread priority via some platform-dependent crate.
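As a rough illustration of the idea, here is a minimal Linux-only sketch that lowers the priority of a worker thread from inside the thread itself. It is not the validator's actual code: the names `lower_current_thread_priority`, `current_thread_nice` and `spawn_low_priority_worker` are hypothetical, and the libc functions are declared by hand only to keep the example free of external crates (a real change would more likely use a crate). It relies on the Linux behavior that the nice value is a per-thread attribute, so `setpriority(PRIO_PROCESS, 0, ...)` affects only the calling thread.

```rust
use std::io;
use std::os::raw::c_int;
use std::thread;

// Hand-written declarations of the libc functions (assumption: Linux, where
// `id_t` is a 32-bit unsigned integer). std already links libc there.
extern "C" {
    fn setpriority(which: c_int, who: u32, prio: c_int) -> c_int;
    fn getpriority(which: c_int, who: u32) -> c_int;
}

const PRIO_PROCESS: c_int = 0;
// 19 is the highest nice value, i.e. the lowest scheduling priority.
const LOWEST_PRIORITY_NICE: c_int = 19;

/// Sets the nice value of the calling thread (`who == 0` means "the caller").
/// Raising the nice value needs no special privileges.
fn lower_current_thread_priority(nice: c_int) -> io::Result<()> {
    let rc = unsafe { setpriority(PRIO_PROCESS, 0, nice) };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

/// Reads back the nice value of the calling thread.
fn current_thread_nice() -> c_int {
    unsafe { getpriority(PRIO_PROCESS, 0) }
}

/// Hypothetical helper: spawns a named thread that renices itself before
/// running `work`. A failure to renice is logged and otherwise ignored.
fn spawn_low_priority_worker<F, T>(name: &str, work: F) -> io::Result<thread::JoinHandle<T>>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new().name(name.to_string()).spawn(move || {
        if let Err(err) = lower_current_thread_priority(LOWEST_PRIORITY_NICE) {
            eprintln!("could not lower thread priority: {}", err);
        }
        work()
    })
}

fn main() {
    let main_nice = current_thread_nice();
    let handle = spawn_low_priority_worker("http-worker", current_thread_nice)
        .expect("failed to spawn worker");
    let worker_nice = handle.join().expect("worker panicked");
    println!("main thread nice: {}, http worker nice: {}", main_nice, worker_nice);
}
```

Doing the renice inside the spawned closure avoids having to look up thread IDs from the outside, which is what the manual `ps | renice` loop has to do.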

Be a bit careful about lock priority inversion, but this should generally be better than nothing. At a minimum, we just need a dichotomy between critical replay threads (including the accounts background service) and other threads (http workers), and we must make sure replay threads never depend on the other threads. (Strictly speaking, they still do via AccountsDB locks.)

Also, as far as I checked, our rpc generally doesn't lock, except for the obvious ones (like getProgramAccounts). So, it's not a super high priority. Anyway, I managed the validator with getConfirmedBlock, so lowering the priority of the serialization done by the http worker threads will be one of the few wins from here.

im-0 commented 2 years ago

Please check https://github.com/solana-labs/solana/pull/21019

github-actions[bot] commented 2 years ago

This issue has been automatically locked since there has not been any activity in past 7 days after it was closed. Please open a new issue for related bugs.