Closed srikrbha closed 2 years ago
I did some checks and managed to reproduce the issue.
It looks like uwsgi takes too much liberty in redefining the sys.executable path. It stores the path to the uwsgi binary there instead of the path to the Python interpreter. As a result, this part no longer works properly:
args = [sys.executable,
'-m', 'pyexasol_utils.http_transport',
'--host', self.host,
'--port', str(self.port),
'--mode', self.mode,
'--ppid', str(os.getpid())
]
Relevant links: https://github.com/unbit/uwsgi/issues/670 https://bugs.python.org/issue36196
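To illustrate the problem described above, here is a minimal sketch of how code could notice that sys.executable does not point to a Python interpreter and fall back to a PATH lookup. The helper names looks_like_python and resolve_python_executable are hypothetical and not part of pyexasol. The file-name check is only a heuristic, and the interpreter found on PATH is not guaranteed to be the one from the active virtual environment.

```python
import os
import shutil
import sys


def looks_like_python(path):
    """Heuristic: True if the file name of `path` starts with 'python'."""
    name = os.path.basename(path or "").lower()
    return name.startswith("python")


def resolve_python_executable():
    """Best-effort guess of an interpreter path (hypothetical helper)."""
    if looks_like_python(sys.executable):
        return sys.executable
    # Under uwsgi, sys.executable may hold the path to the uwsgi binary,
    # so fall back to searching PATH for an interpreter.
    for candidate in ("python3", "python"):
        found = shutil.which(candidate)
        if found:
            return found
    return None
```

For example, looks_like_python("/usr/local/bin/uwsgi") is False, so resolve_python_executable() would skip that value and search PATH instead.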
In theory, subprocess can be replaced with multiprocessing here, but it will cause problems on Windows, which is unable to "fork" properly.
Currently I don't see an easy fix for this which can maintain backwards compatibility.
However, it looks like some parameter was added to set sys.executable manually for uwsgi: https://github.com/unbit/uwsgi/commit/b6308cae818dab78da5f51eae8c903b6e2122b7a
But I am not sure if it was merged and how to set it.
Hope it helps!
Hi @srikrbha and @wildraid,
I tried the patch from https://github.com/unbit/uwsgi/commit/b6308cae818dab78da5f51eae8c903b6e2122b7a and it seems to work.
I installed it via:
pip install https://github.com/unbit/uwsgi/archive/b6308cae818dab78da5f51eae8c903b6e2122b7a.zip
and then started uwsgi with
uwsgi --http :9090 --wsgi-file pyex_dummy.py --py-executable venv/bin/python3.6
where venv/bin/python3.6 is, in my case, the path to my Python binary.
However, it seems this patch is not yet in the stable releases on PyPI, so voting for it in the uwsgi repository may help.
Hi,
In general, export_to_pandas() is not working now, although it worked fine before.
raise cls_err(self, req['sqlText'], ret['exception']['sqlCode'], ret['exception']['text'])
pyexasol.exceptions.ExaQueryError:
(
message => ETL-5106: Following error occured while writing data to external connection [https://000.gz/ failed after 0 bytes. [Could not resolve host: 000.gz],[6],[Couldn't resolve host name]] (Session: 1698910877895153538)
dsn => xxx
user => xxx
schema =>
session_id => 1698910877895153538
code => 42636
query => EXPORT (
SELECT * FROM EXA_SYSCAT
) INTO CSV
AT 'https://' FILE '000.gz'
WITH COLUMN NAMES
)
What is the current fix for this?
Hi @venkatrajgopal17,
What do you mean by 'in general' export_to_pandas doesn't work anymore? How did you run it? Did you use the pyex_dummy.py and the patched version of uwsgi?
@wildraid Could we maybe use multiprocessing, together with a conditional import based on os.name or platform.system(), to maintain backwards compatibility with Windows?
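A minimal sketch of the platform check suggested here, assuming the only decision is which launch mechanism to use. The function name choose_launcher and the returned labels are hypothetical and only illustrate the branching, not how pyexasol is actually structured.

```python
import platform


def choose_launcher(system_name=None):
    """Pick a launch strategy by platform (hypothetical helper).

    Windows keeps the existing subprocess approach; every other
    platform would use multiprocessing.
    """
    if system_name is None:
        system_name = platform.system()
    return "subprocess" if system_name == "Windows" else "multiprocessing"
```

Both branches would still need their own test coverage, since each platform only ever exercises one of them.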
Hi @venkatrajgopal17,
What do you mean by 'in general' export_to_pandas doesn't work anymore? How did you run it? Did you use the pyex_dummy.py and the patched version of uwsgi?
I have tried, but the patch installation doesn't work with my venv.
*** error linking uWSGI ***
----------------------------------------
ERROR: Command errored out with exit status 1: '/mnt/d/Project/Pyexasol_connection/exavenv/bin/python3' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-t3vbe8gb/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-t3vbe8gb/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-bvnapcde/install-record.txt --single-version-externally-managed --compile --install-headers '/mnt/d/Project/Pyexasol_connection/exavenv/include/site/python3.8/uWSGI' Check the logs for full command output.
At the moment I simply use pd.DataFrame(stmt.fetchall(), columns=stmt.column_names()) to load the result into a pandas DataFrame.
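The workaround above can be sketched as a small helper. FakeStatement is a hypothetical stand-in that only mimics the two statement methods used in this thread (fetchall() and column_names()), so the sketch runs without a database. As far as I understand, fetching rows this way avoids the HTTP transport subprocess, but it is likely slower than export_to_pandas for large result sets.

```python
class FakeStatement:
    """Hypothetical stand-in for a statement returned by connection.execute()."""

    def __init__(self, names, rows):
        self._names = names
        self._rows = rows

    def column_names(self):
        return list(self._names)

    def fetchall(self):
        return list(self._rows)


def statement_to_dataframe(stmt):
    """Build a DataFrame from already-fetched rows instead of an EXPORT."""
    import pandas as pd  # imported lazily; pandas must be installed

    return pd.DataFrame(stmt.fetchall(), columns=stmt.column_names())
```

With a real connection, the call would presumably look like statement_to_dataframe(connection.execute(query)).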
@daschnerm it is possible in theory, but it will take a few days to implement and test properly. Also, it will create two major branches of logic, with one branch not being tested at all, since Travis does not support Windows.
Also, the current code is re-used both for normal HTTP transport and parallel HTTP transport, which may run on multiple servers, and multiprocessing will not help.
So... definitely possible, but at a high cost with almost no reward. It may save about 100-300ms by removing the cost of starting up a new Python process, but that's it.
Hi @wildraid,
I may have an idea for a small workaround. Would it be possible to provide an environment variable which specifies the Python interpreter? It would be easier to test and far less invasive.
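A minimal sketch of this idea, reusing the argument list quoted earlier in the thread. The variable name PYEXASOL_PYTHON_EXECUTABLE and the function build_http_transport_args are assumptions made up for illustration. They are not existing pyexasol settings.

```python
import os
import sys

# Hypothetical variable name, chosen for illustration only.
ENV_VAR = "PYEXASOL_PYTHON_EXECUTABLE"


def build_http_transport_args(host, port, mode, environ=None):
    """Build the subprocess arguments, preferring the env override."""
    environ = os.environ if environ is None else environ
    # Fall back to sys.executable when the variable is unset or empty.
    python_path = environ.get(ENV_VAR) or sys.executable
    return [python_path,
            '-m', 'pyexasol_utils.http_transport',
            '--host', host,
            '--port', str(port),
            '--mode', mode,
            '--ppid', str(os.getpid())]
```

Under uwsgi, the variable would be set to the interpreter path (for example venv/bin/python3.6) before starting the server. Without it, the behaviour stays the same as today.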
Closing this issue, continue in #79.
Hello, I have found that the usage of export_to_pandas (or any export_to_* which in turn calls export_to_callback) results in an ExaQueryError when used in a uwsgi based web application. I suspect that this could be due to the multi-threading operation happening within the function. Following are the stack trace and repro steps:
Start a sample uwsgi server using pyex_dummy.py using the following command:
Hit the newly setup endpoint with the curl call in a separate terminal window:
pyex_dummy_no_creds.py.zip
The following stacktrace is observed within the server window:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "test.py", line 14, in application
    df = connection.export_to_pandas(query)
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/connection.py", line 271, in export_to_pandas
    return self.export_to_callback(cb.export_to_pandas, None, query_or_table, query_params, callback_params, export_params)
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/connection.py", line 335, in export_to_callback
    raise sql_thread.exc
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/http_transport.py", line 34, in run
    self.run_sql()
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/http_transport.py", line 153, in run_sql
    self.connection.execute("\n".join(parts))
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/connection.py", line 186, in execute
    return self.cls_statement(self, query, query_params)
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/statement.py", line 55, in __init__
    self._execute()
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/statement.py", line 159, in _execute
    'sqlText': self.query,
  File "/opt/conda3/lib/python3.7/site-packages/pyexasol/connection.py", line 572, in req
    raise cls_err(self, req['sqlText'], ret['exception']['sqlCode'], ret['exception']['text'])
pyexasol.exceptions.ExaQueryError:
(
message => ETL-5106: Following error occured while writing data to external connection [http://000.gz/ failed after 0 bytes. [Could not resolve host: 000.gz],[6],[Couldn't resolve host name]] (Session: 1698365725588979714)
dsn => DEMODB.EXASOL.COM
user => PUB3511
schema =>
session_id => 1698365725588979714
code => 42636
query => EXPORT (
SELECT * FROM EXA_SYSCAT
) INTO CSV
AT 'http://' FILE '000.gz'
WITH COLUMN NAMES
)