dropbox / PyHive

Python interface to Hive and Presto. 🐝

"TSocket read 0 bytes" during a long running Hive insert query #240

Open gseva opened 6 years ago

gseva commented 6 years ago

I'm running a long-ish insert query in Hive using PyHive 0.6.1, and it fails with thrift.transport.TTransport.TTransportException: TSocket read 0 bytes after about 5 minutes of running. On the server side the query keeps running until it finishes successfully. I don't have this problem with fast queries.

The environment in which this happens is a Docker container based on python:3.6-slim. Among other things, I'm installing the libsasl2-dev and libsasl2-modules packages and the pyhive[hive] Python package. I can't reproduce it locally on my Mac with the same Python version: there the code correctly waits until the query finishes.

Any clue why this is happening? Thanks in advance.

The code I'm using is:

import contextlib
from pyhive.hive import connect

def get_conn():
    return connect(
        host='my-host',
        port=10000,
        auth='NONE',
        username='username',
        database='database'
    )

with contextlib.closing(get_conn()) as conn, \
        contextlib.closing(conn.cursor()) as cur:
    cur.execute('My long insert statement')

This is the full traceback:

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/usr/local/lib/python3.6/site-packages/pyhive/hive.py", line 364, in execute
    response = self._connection.client.ExecuteStatement(req)
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 280, in ExecuteStatement
    return self.recv_ExecuteStatement()
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 292, in recv_ExecuteStatement
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 134, in readMessageBegin
    sz = self.readI32()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 217, in readI32
    buff = self.trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 166, in read
    self._read_frame()
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 170, in _read_frame
    header = self._trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TSocket.py", line 132, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/usr/local/lib/python3.6/contextlib.py", line 185, in __exit__
    self.thing.close()
  File "/usr/local/lib/python3.6/site-packages/pyhive/hive.py", line 221, in close
    response = self._client.CloseSession(req)
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 218, in CloseSession
    return self.recv_CloseSession()
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 230, in recv_CloseSession
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 134, in readMessageBegin
    sz = self.readI32()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 217, in readI32
    buff = self.trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 166, in read
    self._read_frame()
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 170, in _read_frame
    header = self._trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TSocket.py", line 132, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

mauza commented 5 years ago

We are periodically getting this same error and haven't found a solution. Some ideas: HiveServer2 might be set to the HTTP transport instead of binary, or the server might be severing the connection for some reason...?

gseva commented 5 years ago

@mauza We had to roll back to an earlier version of pyhive. These are the versions we're using: thrift==0.9.3, thrift_sasl==0.2.1, pyhive[hive]==0.2.1

mauza commented 5 years ago

Thanks for sharing those versions. Looks like there might be some breaking changes in the version of pyhive we were using, but I'll work through those tomorrow. Maybe I should start a new thread, because our problem is fairly intermittent and doesn't require a long-running Hive insert query...

meatheadmike commented 5 years ago

Same issue here. Long SQL statements simply freeze up for me and I eventually get that timeout. I'm running under Docker with debian:stretch.

noniu commented 5 years ago

Same issue here, after about 10 minutes of running.

kupi93 commented 5 years ago

I'm also struggling with the same issue.

wildcardops commented 5 years ago

Same issue here

JohnOmernik commented 5 years ago

Is there any update on this? Getting this issue.

snivas commented 5 years ago

Any updates?

emhlbmc commented 5 years ago

Same issue

AndreyEmelyanenko commented 5 years ago

Same issue. Hive 2.3.3.

SuiMingYang commented 4 years ago

I changed my Hive port from 22 to 10000 and it works. Maybe this helps you.

JavadBahoosh commented 4 years ago

> @mauza We had to roll back to an earlier version of pyhive. These are the versions we're using: thrift==0.9.3, thrift_sasl==0.2.1, pyhive[hive]==0.2.1

Thanks @gseva, that saved my day.

xianyinxin commented 4 years ago

I encountered the same problem when I used 'auth=NOSASL'. Then I changed to 'auth=NONE' and encountered another problem: 'TSaslClientTransport' object has no attribute 'readAll'. The latter is because the thrift_sasl version installed by default (0.2.1) is not compatible with Python 3. After upgrading it, the problem was resolved.

Final configs: python: 3.6, pyhive: 0.6.2, thrift: 0.13.0, thrift_sasl: 0.4.2

wilberh commented 4 years ago

Was this problem fixed in the latest version?

wilberh commented 4 years ago

Here's a way to keep the connection from being reset or dropped during long-running queries:

(solved) Hive - connection dropped before job is done in Hadoop
https://github.com/dropbox/PyHive/issues/358

https://github.com/dropbox/PyHive/tree/v0.6.2#db-api-asynchronous Running the SQL query asynchronously prevents the connection from being dropped when a query takes a long time.

FYI: if you are using an ORM (such as peewee or SQLAlchemy), get the cursor from the "database" object. Example in peewee: database.get_cursor()
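
A minimal sketch of the asynchronous approach described in the comment above, assuming a PyHive version that accepts the `async_=True` keyword on `cursor.execute()` and exposes `cursor.poll()`. The helper names `execute_and_wait` and `run_long_insert`, the polling interval, and the set of "still running" states are illustrative choices, not part of PyHive. Whether this avoids the dropped connection in a given environment is not confirmed beyond the report in this thread.

```python
import time


def execute_and_wait(cursor, sql, running_states, poll_interval=5.0, sleep=time.sleep):
    """Submit `sql` asynchronously and poll until it leaves `running_states`.

    Returns the last operation state reported by the server.
    """
    # With async_=True, execute() returns once the statement is submitted
    # instead of blocking on a single long-lived read.
    cursor.execute(sql, async_=True)
    state = cursor.poll().operationState
    while state in running_states:
        sleep(poll_interval)  # each poll() is a short request/response
        state = cursor.poll().operationState
    return state


def run_long_insert(sql, host="my-host", port=10000):
    """Hypothetical usage against a real HiveServer2; needs pyhive[hive] installed."""
    import contextlib

    from pyhive import hive
    from TCLIService.ttypes import TOperationState

    running = (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE)
    with contextlib.closing(hive.connect(host=host, port=port)) as conn, \
            contextlib.closing(conn.cursor()) as cur:
        final = execute_and_wait(cur, sql, running)
        if final != TOperationState.FINISHED_STATE:
            raise RuntimeError("query ended in state %s" % final)
```

The polling loop takes the cursor and the running states as arguments so it can be exercised with a stub cursor, without a Hive server.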

aidan-melen commented 4 years ago

> I encountered the same problem when I used 'auth=NOSASL'. Then I changed to 'auth=NONE' and encountered another problem: 'TSaslClientTransport' object has no attribute 'readAll'. The latter is because the thrift_sasl version installed by default (0.2.1) is not compatible with Python 3. After upgrading it, the problem was resolved.
>
> Final configs: python: 3.6, pyhive: 0.6.2, thrift: 0.13.0, thrift_sasl: 0.4.2

Also worked on python: 3.8