Closed srikrbha closed 2 years ago
Could you try to add option --enable-threads
and remove limitations to number of threads altogether? Just let Python create as many threads as it wants to.
I suspect, you might be hitting "20 threads" limit due to WebSocket client background activity.
PyEXASOL uses only 1 extra thread to monitor the execution status of SQL query. But WebSockets require a lot of threads to read / write asynchronously, check timeouts and send / receive PING / PONG frames.
@littleK0i The original uwsgi.ini
file that I used had --enable-threads
. But one question, any reason why a sub-process is facing the thread limit? Shouldn't it be out of the limits set by uwsgi? Or is it because it inherits more than just the env vars and the sockets?
Note: The subprocess was spawned using the python subprocess
module with the uwsgi server process being the parent process.
Well, technically when you create a new process in Linux, it "forks" and child inherits everything from parent. It includes all sockets, threads, resources, memory, almost everything. So I would not be surprised if thread limits are inherited as well.
Did it help to remove the limit?
I was thinking about creating a second implementation for HTTP transport, using threads only. No subprocess, no forking.
Removing the thread limit did work, but, we want to keep it since in our case we have limited cpus allocated to the server spawned and we want to limit the threads spawned to prevent any contention. On that note, within the same server process, we do have other thread pools that run as well, but they have never had any such problems. Or could it be that they didn't have any problems as they aren't contending for any sockets? On that note, to rephrase the problem, what I have been observing from the logs is that the HTTP transport thread that communicates with exasol does get created, but, the thread gets stuck at exception handling and joining.
Note: Just noticed that I didn't give you any logs from one of my tries. Big mistake on my part! Here it is: PyEXASOL_20210909_035721_897429.log
I have a theory.
Maybe it gets stuck because it is unable to terminate subprocess due to this behaviour: https://uwsgi-docs.readthedocs.io/en/latest/ThingsToKnow.html
Till uWSGI 2.1, by default, sending the SIGTERM signal to uWSGI means “brutally reload the stack” while the convention is to shut an application down on SIGTERM. To shutdown uWSGI use SIGINT or SIGQUIT instead. If you absolutely can not live with uWSGI being so disrespectful towards SIGTERM, by all means enable the die-on-term option. Fortunately, this bad choice has been fixed in uWSGI 2.1
PyEXASOL uses .terminate
to send SIGTERM and stop the subprocess listening to data stream from EXPORT. If uwsgi overloads signal handlers for SIGTERM, it might be inherited by subprocess.
You may test it by replacing .terminate()
call with .kill()
here: https://github.com/badoo/pyexasol/blob/master/pyexasol/http_transport.py#L269
@littleK0i Thanks for the info. Will check and get back.
@littleK0i You were right with your theory, .kill()
indeed works. @srikrbha tested the uWSGI flag die-on-term
as well, however this did not work. Is there anything that speaks against exchanging .terminate()
with kill()
?
It's possible to swap it with .kill()
, but it's definitely not the best practice when we can terminate normally.
Could you try to reset SIGTERM signal handler to default? In order to do so, please the following code add after this line: https://github.com/badoo/pyexasol/blob/master/pyexasol_utils/http_transport.py#L236
import signal
signal.signal(signal.SIGTERM,signal.SIG_DFL)
It should reset signal handler to default, unless uWSGI added even more weirdness.
@daschnerm , @srikrbha
Indeed, child process inherits signals of parent process by design.
I've added reset of signal handler in 0.21.1
. It should fix this specific problem.
Code: https://github.com/badoo/pyexasol/commit/7d2921456c740e755f5e96dc9a14dec8babc9713
@srikrbha , @daschnerm , If it is still relevant, please try to use pyexasol latest version (0.23.1+
) with uWSGI.
HTTP transport was fully reworked to use threads instead of child processes. Subprocesses are no longer started. sys.executable
, signal handling and many other problems should be resolved automatically.
We may see some new unforeseen problems related specifically to threads, but it is unlikely.
Thank you.
@srikrbha, @daschnerm: the last message from @littleK0i suggests that this ticket can be closed, right?
Closing ticket. No further feedback. Should be solved.
After making a custom build of py-exasol while adding changes to
sys.executable
(changes in repo: https://github.com/daschnerm/pyexasol/tree/master, based on discussion in issue: https://github.com/badoo/pyexasol/issues/73), we see that a subprocess spawned from the uwsgi server gets stuck while raising anExaQueryError
after callingexport_to_pandas
function. This problem is observed when we use thethreads = x
setting in uwsgi to limit the number of worker threads that get spawned by uwsgi.Sample repro repro.tar.gz:
subprocess_check.py
add the necessary exasol credentials and make sure that the internal query tries to access an inexistent table in the database.In case you need exasol logs, you can build the exasol from the following branch of the repo instead for debugging: https://github.com/daschnerm/pyexasol/tree/debugging.