omnilib / aiosqlite

asyncio bridge to the standard sqlite3 module
https://aiosqlite.omnilib.dev
MIT License
1.22k stars 94 forks source link

Reduce CPU usage and locking in the connection thread loop #271

Closed bdraco closed 8 months ago

bdraco commented 10 months ago

The current design uses a Queue and times out every 0.1s to check if the thread should terminate. This wakes up the thread every 0.1 even if there is no work to do. An strace of the python process with -f will show this behavior as well as the futex.

To reduce the overhead, the Queue has been replaced with a SimpleQueue since the minimum python version was bumped to 3.7+ (3.8), and the use case does not require complex functionality. SimpleQueue avoids all the threading locks. Additionally the queue consumer thread now terminates when it sees the _STOP_RUNNING_SENTINEL so it does not have to wake up every 0.1s to check if it needs to terminate.

This solution builds on #213 (thanks @vovukman) and takes into consideration the sqlalchemy use case detailed in https://github.com/omnilib/aiosqlite/pull/213#issuecomment-1670856398

bdraco commented 10 months ago

Verified thread looks good on the trace

2024-01-06 01:42:31.252 CRITICAL (MainThread) [homeassistant.components.profiler] Thread [Thread-4]: File "/usr/local/lib/python3.12/threading.py", line 1030, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.12/site-packages/aiosqlite/core.py", line 107, in run
    tx_item = self._tx.get()
bdraco commented 10 months ago

strace the process, all the thread switching every 0.1s and futexs are gone now 👍

bdraco commented 10 months ago

Verified smoke tests pass

Testing on Python 3.12.0 perf.py

Before

Perf Test Iterations Duration Rate iterable_cursor @ 1024 2646017 2.0s 1322525.6/s iterable_cursor @ 128 1218049 2.0s 608890.0/s iterable_cursor @ 16 242561 2.0s 121261.3/s iterable_cursor @ 256 1776897 2.0s 888253.0/s iterable_cursor @ 32 436641 2.0s 218276.9/s iterable_cursor @ 512 2314241 2.0s 1156938.9/s iterable_cursor @ 64 758337 2.0s 379106.9/s atomics 5639 2.0s 2819.2/s connection_file 5216 2.0s 2607.8/s connection_memory 5598 2.0s 2798.8/s insert_ids 4108 2.0s 2053.6/s insert_macro_ids 8470 2.0s 4234.5/s inserts 8524 2.0s 4261.4/s select 7593 2.0s 3796.0/s select_macro 13951 2.0s 6974.5/s

After

Perf Test Iterations Duration Rate iterable_cursor @ 1024 2710529 2.0s 1354900.0/s iterable_cursor @ 128 1205761 2.0s 602742.5/s iterable_cursor @ 16 247601 2.0s 123778.5/s iterable_cursor @ 256 1786113 2.0s 892894.9/s iterable_cursor @ 32 445185 2.0s 222557.5/s iterable_cursor @ 512 2308680 2.0s 1154183.6/s iterable_cursor @ 64 751489 2.0s 375663.4/s atomics 5802 2.0s 2900.3/s connection_file 5985 2.0s 2992.4/s connection_memory 6471 2.0s 3235.1/s insert_ids 4331 2.0s 2164.9/s insert_macro_ids 8645 2.0s 4321.7/s inserts 8664 2.0s 4331.3/s select 7663 2.0s 3830.7/s select_macro 14155 2.0s 7076.6/s

bdraco commented 10 months ago

Validated this on my production Home Assistant instance.

amyreese commented 8 months ago

Thank you @vovukman for the initial PR, and thank you @bdraco for your additional work. Sorry it took so long to get this into main. 🥂