Closed reitzig closed 4 years ago
Ok, this is an interesting case.
1) Which version of "websocket-client" do you have? How to check:
import websocket
print(websocket.__version__)
2) Could you reproduce the same problem with debug=True
and attach the log?
Thank you.
Hey, I work with @reitzig on this issue. Regarding your first question:
Python 3.7.6 (default, Jan 3 2020, 23:53:24)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import websocket
>>> print(websocket.__version__)
0.57.0
Second question: Should we set debug=True
in pyexasol.connect()
?
Regards Michael
One more question.
Do you use threading
or multiprocessing
of any kind in the main script? If so, do you share the same connection across multiple processes or threads?
There is a protection from threading problems, but multiprocessing problems after fork()
might be a real issue.
@mymindwentblvnk , yes, debug
is just an option for pyexasol.connect()
.
Thank you!
Hey @wildraid, we do not parallelize anything at the moment, but already thought about doing so. So thanks for the heads up.
Will add the debug
flag now. ~Where can the important logs be found afterwards?~
Nevermind. Will set debug_logdir
too.
So here are the logs for the problematic query:
2020-02-27 09:13:32.339 [WebSocket request #396]
{
"command": "execute",
"sqlText": "MERGE INTO <snip>
USING <snip> landing_inserts
ON ( lake_table.<snip> = landing_inserts.<snip> )
WHEN MATCHED THEN
UPDATE SET lake_table.<snip> = landing_inserts.<snip> , ..., lake_table.<snip>=CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
INSERT VALUES (...)"
}
2020-02-27 09:13:35.350 [WebSocket response #396]
{
"status": "error",
"exception": {
"text": "WebSocket: readMessage: Unsupported WebSocket opcode: 7 (Session: 1659680558719702237)",
"sqlCode": "00000"
}
}
Thanks already 👍 Michael
This is how we create the EXASOL connection:
conn = pyexasol.connect(dsn=exa_host + ':' + exa_port,
user=exa_user,
password=exa_password,
encryption=True,
compression=True,
debug=True,
debug_logdir='/usr/local/<snip>/log',
fetch_dict=True)
@mymindwentblvnk ,
1) Could you attach the whole log? SQL queries and response data can be removed or replaced if it's too long. There is some meta-information in requests which can be useful for investigation.
2) How long is the SQL query text which causes the problem? How long are "VALUES"? I suspect it might be reaching the maximum size of text / data frame, which causes client to send the second frame, and Exasol server might not recognise it properly.
How long is the SQL query text which causes the problem? How long are "VALUES"?
They are quite long, merging several hundreds of there. I don't think that changed a lot recently, and it's certainly on different between container vs. real Exasol (which we use in other deployment stages). Those run an older version, though, namely 6.1.2. @mymindwentblvnk has the log at hand, he can tell you how long the strings actually are.
I suspect it might be reaching the maximum size of text / data frame, which causes client to send the second frame, and Exasol server might not recognise it properly.
If there's something we could reconfigure in that regard, we could try it out. Do you know something about that?
The full log. The last query causes the problem and it has a length of 12899 characters. The one before with 6815 characters works fine.
2020-02-27 09:12:02.192 Connection attempt [172.27.0.5:8888]
2020-02-27 09:12:02.215 [WebSocket request #1]
{
"command": "login",
"protocolVersion": 1
}
2020-02-27 09:12:02.250 [WebSocket response #1]
{
"status": "ok",
"responseData": {
"publicKeyPem": "<snip>",
"publicKeyExponent": "010001",
"publicKeyModulus": "<snip>"
}
}
2020-02-27 09:12:02.286 [WebSocket request #2]
{
"username": "sys",
"password": "<snip>",
"driverName": "PyEXASOL 0.11.2",
"clientName": "PyEXASOL",
"clientVersion": "0.11.2",
"clientOs": "Linux-4.19.76-linuxkit-x86_64-with-debian-9.12",
"clientOsUsername": "airflow",
"clientRuntime": "Python 3.7.6",
"useCompression": true,
"attributes": {
"currentSchema": "",
"autocommit": true,
"queryTimeout": 0,
"snapshotTransactionsEnabled": false
}
}
2020-02-27 09:12:02.296 [WebSocket response #2]
{
"status": "ok",
"responseData": {
"protocolVersion": 1,
"timeZone": "EUROPE/BERLIN",
"timeZoneBehavior": "INVALID SHIFT AMBIGUOUS ST",
"sessionId": 1659680558719702237,
"maxDataMessageSize": 67108864,
"releaseVersion": "6.2.4",
"databaseName": "DB1",
"productName": "EXASolution",
"maxIdentifierLength": 128,
"maxVarcharLength": 2000000,
"identifierQuoteString": "\""
}
}
2020-02-27 09:12:02.296 [WebSocket request #3]
{
"command": "getAttributes"
}
2020-02-27 09:12:02.321 [WebSocket response #3]
{
"status": "ok",
"attributes": {
"datetimeFormat": "YYYY-MM-DD HH24:MI:SS.FF6",
"dateFormat": "YYYY-MM-DD",
"numericCharacters": ".,",
"dateLanguage": "ENG",
"queryTimeout": 0,
"timezone": "EUROPE/BERLIN",
"timezoneBehavior": "INVALID SHIFT AMBIGUOUS ST",
"defaultLikeEscapeCharacter": "\\",
"autocommit": true,
"compressionEnabled": true,
"currentSchema": "",
"snapshotTransactionsEnabled": false,
"openTransaction": 0
}
}
2020-02-27 09:12:03.242 [WebSocket request #4]
{
"command": "execute",
"sqlText": "<snip>"
}
2020-02-27 09:12:03.289 [WebSocket response #4]
{
"status": "ok",
"responseData": {
"results": [
{
"resultType": "rowCount",
"rowCount": 0
}
],
"numResults": 1
}
}
2020-02-27 09:12:03.290 [WebSocket request #5]
{
"command": "execute",
"sqlText": "COMMIT"
}
2020-02-27 09:12:03.301 [WebSocket response #5]
{
"status": "ok",
"responseData": {
"results": [
{
"resultType": "rowCount",
"rowCount": 0
}
],
"numResults": 1
}
}
2020-02-27 09:12:03.302 [WebSocket request #6]
{
"command": "execute",
"sqlText": "<snip>"
}
2020-02-27 09:12:03.346 [WebSocket response #6]
{
"status": "ok",
"responseData": {
"results": [
{
"resultType": "rowCount",
"rowCount": 0
}
],
"numResults": 1
}
}
2020-02-27 09:12:03.347 [WebSocket request #7]
{
"command": "execute",
"sqlText": "COMMIT"
}
2020-02-27 09:12:03.357 [WebSocket response #7]
{
"status": "ok",
"responseData": {
"results": [
{
"resultType": "rowCount",
"rowCount": 0
}
],
"numResults": 1
}
}
// ...
// There are over hundred queries like the one above
// ...
2020-02-27 09:13:31.495 [WebSocket request #394]
{
"command": "execute",
"sqlText": "<ANOTHER LONG QUERY>"
}
2020-02-27 09:13:32.325 [WebSocket response #394]
{
"status": "ok",
"responseData": {
"results": [
{
"resultType": "rowCount",
"rowCount": 0
}
],
"numResults": 1
}
}
2020-02-27 09:13:32.325 [WebSocket request #395]
{
"command": "execute",
"sqlText": "COMMIT"
}
2020-02-27 09:13:32.337 [WebSocket response #395]
{
"status": "ok",
"responseData": {
"results": [
{
"resultType": "rowCount",
"rowCount": 0
}
],
"numResults": 1
}
}
2020-02-27 09:13:32.339 [WebSocket request #396]
{
"command": "execute",
"sqlText": "<THE LONG QUERY THAT CAUSES THE PROBLEM>"
}
2020-02-27 09:13:35.350 [WebSocket response #396]
{
"status": "error",
"exception": {
"text": "WebSocket: readMessage: Unsupported WebSocket opcode: 7 (Session: 1659680558719702237)",
"sqlCode": "00000"
}
}
@reitzig ,
If there's something we could reconfigure in that regard, we could try it out. Do you know something about that?
If this is the case, I'll isolate the reproducible test case and raise a ticket for Exasol support.
@mymindwentblvnk , @reitzig , could you try to disable compression
and repeat the test? Compression might implicitly change the frame type to "binary" instead of "text", which never caused any problems before. But maybe Exasol changed something specifically in 6.2+.
Also, if you have access to SQL process logs, the last few lines before error might be helpful. It is possible to access such logs via ExaOperation or checking logs inside docker container.
Thank you!
I did some tests locally and could not reproduce the problem. I guess, it's not as simple as "too long query".
Could you try to open a new session by calling pyexasol.connect()
one more time before executing the problematic query? Let's see if it changes anything.
Thank you.
I think I've managed to reproduce it. I'll raise a ticket for Exasol support on Monday and see if we can come up with workaround.
It seems, only version 6.2.4 is affected.
Thank you, @wildraid.
Reverted to 6.2.3; can confirm that this issue goes away.
Thanks for handling and pushing this, @wildraid!
@reitzig , @mymindwentblvnk , this issue was related to encryption=True
. It will be fixed in 6.2.5: https://www.exasol.com/support/browse/EXASOL-2645
Thank you for discovering this. 👍
Thank you for taking care of it. We are looking forward to the new version :)
We can indeed not reproduce the issue with Exasol 6.2.5 (and pyexasol 0.12.0) -- thanks!
We're seeing low-level errors bubble up through pyexasol which we can't explain. For example:
This is after several queries ran through successfully; it is the first
MERGE
, though.Re-running the same code yielded the same error with
opcode: 12
. We've also seenopcode 14
once on another query, but have not been able to reproduce.This is running queries with pyexasol 0.11.2 on Python 3.7.6 against
exasol/docker-db:6.2.4-d1
.