cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0
725 stars 247 forks source link

client hangs when connecting to HiveServer2 #184

Open alexwoolford opened 8 years ago

alexwoolford commented 8 years ago

When I try to run the following code, the client hangs when trying to connect to Hive:

from impala.dbapi import connect

conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz')
cursor = conn.cursor()          <- hangs here
cursor.execute('show tables')
results = cursor.fetchall()
print results

I was able to connect to HiveServer2, via a Java client, and so it seems that the connectivity issue is Python/Impyla specific. When I debug/step-into, the code hangs at line 873 of hiveserver2.py.

Hive 1.1.0-cdh5.5.1 Python 2.7.11 | Anaconda 2.3.0 Impyla installed by pip today (2016-03-07).

Do you have any thoughts/suggestions?

wesm commented 8 years ago

It looks like PyHive connects to Hive using a different protocol:

https://github.com/dropbox/PyHive/blob/master/pyhive/hive.py#L87

Can you try changing V6 to V1 and see if that fixes your issue? If this works I can add a configuration option. I can also try to repro on my side.

alexwoolford commented 8 years ago

Thanks Wes. It's an honor!

I changed the client from V6 to V1 and it still hangs. I can share my screen via Google Hangout if that's helpful.

wesm commented 8 years ago

It would be nice to understand this issue if you do figure out -- unfortunately I don't have the bandwidth to help debug it further. If you are able to sort it out (and if it is an impyla bug) please let me know the resolution here. cc @mjacobs

amanjunath commented 8 years ago

@alexwoolford @wesm I was facing the same issue. Changing the auth_mechanism to 'PLAIN' worked for me. Here's the documentation in the code: auth_mechanism : {'NOSASL', 'PLAIN', 'GSSAPI', 'LDAP'} Specify the authentication mechanism.'NOSASL'for unsecured Impala. 'PLAIN'for unsecured Hive (because Hive requires the SASL transport).'GSSAPI'for Kerberos and'LDAP'for Kerberos with LDAP. The default used is NOSASL: def connect(host='localhost', port=21050, database=None, timeout=None, use_ssl=False, ca_cert=None, auth_mechanism='NOSASL', user=None, password=None, kerberos_service_name='impala', use_ldap=None, ldap_user=None, ldap_password=None, use_kerberos=None, protocol=None):

obaidcuet commented 8 years ago

@amanjunath thanks man. Below worked for me. Just adding auth_mechanism='PLAIN': conn = connect(host='myhost', port=10000, auth_mechanism='PLAIN', user='myuser', password=None)

chansonZ commented 7 years ago

not support ‘PLAIN’ TTransportException: Bad status: 3 (Unsupported mechanism type PLAIN)

darrkz commented 6 years ago

@wesm
Where my code hangs is results = cursor.fetchmany() <-- and it happen only occasionally and when I rerun my code, it goes ok...

joetristano commented 4 years ago

@wesm I've been running a load test on our cluster of 192 processes by 14 queries. I see that my machine establishes 192 connections to the cluster using netstat but via logging, I see that I only receive 180 cursors. The remaining 12 cursors hang for over 1 hour until they return a cursor object from the connection. Adjusting the impala maximum connections did not help. I am using a single connection and cursor to execute all 14 queries with a sleep in between each query to simulate concurrent user action.

Do you have any thoughts on this?

Thanks Joe

Hang occurs @ TOpenSessionReq


Attempting to   open transport (tries_left=2)
Transport opened
Establishing Connection
Connecting to HiveServer2 hostname:25003   with PLAIN authentication mechanism
get_socket: host=hostname port=25003   use_ssl=False ca_cert=None
sock=<thrift.transport.TSocket.TSocket   instance at 0x7f765fea0aa0>
get_transport:   socket=<thrift.transport.TSocket.TSocket instance at 0x7f765fea0aa0>   host=hostname kerberos_service_name=impala auth_mechanism=PLAIN   user=userpassword=fuggetaboutit
transport=<thrift_sasl.TSaslClientTransport   instance at 0x7f765fea0e60>   protocol=<thrift.protocol.TBinaryProtocol.TBinaryProtocolAccelerated   instance at 0x7f765fea7140>   service=<impala._thrift_gen.ImpalaService.ImpalaHiveServer2Service.Client   object at 0x7f765fe9dd50>
HiveServer2Connection(service=<impala.hiveserver2.HS2Service   object at 0x7f765fe9dd90>, default_db=co5012_cpi_int)
Connection Established
Acquiring Cursor
Getting a cursor (Impala session)
.cursor(): getting new session_handle
OpenSession:   req=TOpenSessionReq(username='root', password=None, client_protocol=5,   configuration=None)
Attempting to open transport   (tries_left=3)
Transport opened

Here' s more info from another process - Notice the 1 hour and 40 minute hang

`

Apr 1, 2020 @ 19:57:56.000 Transport opened
Apr 1, 2020 @ 19:57:56.000 OpenSession: resp=TOpenSessionResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), sessionHandle=TSessionHandle(sessionId=THandleIdentifier(secret='\xaa\n\xfb=y\xdaI\t\x98u\r6>Ss0', guid='.>\t\x12\xff\xd9G\xbb\x80\x01U\x05a\x87\xad\x89')), configuration={'RUNTIME_BLOOM_FILTER_SIZE': '1048576', 'DECIMAL_V2': '0', 'QUERY_TIMEOUT_S': '0', 'RM_INITIAL_MEM': '0', 'HBASE_CACHE_BLOCKS': '0', 'SCHEDULE_RANDOM_REPLICA': '0', 'DEFAULT_ORDER_BY_LIMIT': '-1', 'REPLICA_PREFERENCE': '0', 'RUNTIME_FILTER_MODE': '2', 'HBASE_CACHING': '0', 'DISABLE_CODEGEN': '0', 'S3_SKIP_INSERT_STAGING': '1', 'ABORT_ON_ERROR': '0', 'PREFETCH_MODE': '1', 'MAX_ROW_SIZE': '524288', 'PARQUET_FILE_SIZE': '0', 'DISABLE_STREAMING_PREAGGREGATIONS': '0', 'SHUFFLE_DISTINCT_EXPRS': '1', 'COMPRESSION_CODEC': '', 'DISABLE_ROW_RUNTIME_FILTERING': '0', 'DISABLE_OUTERMOST_TOPN': '0', 'MIN_SPILLABLE_BUFFER_SIZE': '65536', 'PARQUET_READ_STATISTICS': '1', 'http_addr': 'c3kuhdpnode7.premierinc.com:25000', 'STRICT_MODE': '0', 'BATCH_SIZE': '0', 'MAX_SCAN_RANGE_LENGTH': '0', 'DISABLE_CODEGEN_ROWS_THRESHOLD': '50000', 'MT_DOP': '', 'RUNTIME_FILTER_MIN_SIZE': '1048576', 'PARQUET_FALLBACK_SCHEMA_RESOLUTION': '0', 'NUM_NODES': '0', 'ALLOW_UNSUPPORTED_FORMATS': '0', 'EXEC_TIME_LIMIT_S': '0', 'RUNTIME_FILTER_MAX_SIZE': '16777216', 'NUM_SCANNER_THREADS': '0', 'SCAN_NODE_CODEGEN_THRESHOLD': '1800000', 'MAX_NUM_RUNTIME_FILTERS': '10', 'EXPLAIN_LEVEL': '1', 'SCRATCH_LIMIT': '-1', 'SEQ_COMPRESSION_MODE': '', 'MAX_ERRORS': '100', 'MAX_IO_BUFFERS': '0', 'COMPUTE_STATS_MIN_SAMPLE_SIZE': '1073741824', 'RUNTIME_FILTER_WAIT_TIME_MS': '0', 'PARQUET_DICTIONARY_FILTERING': '1', 'DEBUG_ACTION': '', 'PARQUET_ARRAY_RESOLUTION': '2', 'DISABLE_UNSAFE_SPILLS': '0', 'MAX_MEM_ESTIMATE_FOR_ADMISSION': '0', 'BUFFER_POOL_LIMIT': '', 'RESERVATION_REQUEST_TIMEOUT': '', 'SYNC_DDL': '0', 'PARQUET_ANNOTATE_STRINGS_UTF8': '0', 'DISABLE_CACHED_READS': '0', 'ABORT_ON_DEFAULT_LIMIT_EXCEEDED': '0', 'APPX_COUNT_DISTINCT': '0', 'V_CPU_CORES': '', 'MEM_LIMIT': '0', 'OPTIMIZE_PARTITION_KEY_SCANS': '0', 'DEFAULT_JOIN_DISTRIBUTION_MODE': '0', 'DEFAULT_SPILLABLE_BUFFER_SIZE': '2097152', 'REQUEST_POOL': '', 'IDLE_SESSION_TIMEOUT': '0', 'EXEC_SINGLE_NODE_ROWS_THRESHOLD': '100', 'ENABLE_EXPR_REWRITES': '0'}, serverProtocolVersion=5)
Apr 1, 2020 @ 19:57:56.000 HiveServer2Cursor(service=<impala.hiveserver2.HS2Service object at 0x7f054b8bed90>, session_handle=TSessionHandle(sessionId=THandleIdentifier(secret='\xaa\n\xfb=y\xdaI\t\x98u\r6>Ss0', guid='.>\t\x12\xff\xd9G\xbb\x80\x01U\x05a\x87\xad\x89')), default_config={'RUNTIME_BLOOM_FILTER_SIZE': '1048576', 'DECIMAL_V2': '0', 'QUERY_TIMEOUT_S': '0', 'RM_INITIAL_MEM': '0', 'HBASE_CACHE_BLOCKS': '0', 'SCHEDULE_RANDOM_REPLICA': '0', 'DEFAULT_ORDER_BY_LIMIT': '-1', 'REPLICA_PREFERENCE': '0', 'RUNTIME_FILTER_MODE': '2', 'HBASE_CACHING': '0', 'DISABLE_CODEGEN': '0', 'S3_SKIP_INSERT_STAGING': '1', 'ABORT_ON_ERROR': '0', 'PREFETCH_MODE': '1', 'MAX_ROW_SIZE': '524288', 'PARQUET_FILE_SIZE': '0', 'DISABLE_STREAMING_PREAGGREGATIONS': '0', 'SHUFFLE_DISTINCT_EXPRS': '1', 'COMPRESSION_CODEC': '', 'DISABLE_ROW_RUNTIME_FILTERING': '0', 'DISABLE_OUTERMOST_TOPN': '0', 'MIN_SPILLABLE_BUFFER_SIZE': '65536', 'PARQUET_READ_STATISTICS': '1', 'http_addr': 'c3kuhdpnode7.premierinc.com:25000', 'STRICT_MODE': '0', 'BATCH_SIZE': '0', 'MAX_SCAN_RANGE_LENGTH': '0', 'DISABLE_CODEGEN_ROWS_THRESHOLD': '50000', 'MT_DOP': '', 'RUNTIME_FILTER_MIN_SIZE': '1048576', 'PARQUET_FALLBACK_SCHEMA_RESOLUTION': '0', 'NUM_NODES': '0', 'ALLOW_UNSUPPORTED_FORMATS': '0', 'EXEC_TIME_LIMIT_S': '0', 'RUNTIME_FILTER_MAX_SIZE': '16777216', 'NUM_SCANNER_THREADS': '0', 'SCAN_NODE_CODEGEN_THRESHOLD': '1800000', 'MAX_NUM_RUNTIME_FILTERS': '10', 'EXPLAIN_LEVEL': '1', 'SCRATCH_LIMIT': '-1', 'SEQ_COMPRESSION_MODE': '', 'MAX_ERRORS': '100', 'MAX_IO_BUFFERS': '0', 'COMPUTE_STATS_MIN_SAMPLE_SIZE': '1073741824', 'RUNTIME_FILTER_WAIT_TIME_MS': '0', 'PARQUET_DICTIONARY_FILTERING': '1', 'DEBUG_ACTION': '', 'PARQUET_ARRAY_RESOLUTION': '2', 'DISABLE_UNSAFE_SPILLS': '0', 'MAX_MEM_ESTIMATE_FOR_ADMISSION': '0', 'BUFFER_POOL_LIMIT': '', 'RESERVATION_REQUEST_TIMEOUT': '', 'SYNC_DDL': '0', 'PARQUET_ANNOTATE_STRINGS_UTF8': '0', 'DISABLE_CACHED_READS': '0', 'ABORT_ON_DEFAULT_LIMIT_EXCEEDED': '0', 'APPX_COUNT_DISTINCT': '0', 'V_CPU_CORES': '', 'MEM_LIMIT': '0', 'OPTIMIZE_PARTITION_KEY_SCANS': '0', 'DEFAULT_JOIN_DISTRIBUTION_MODE': '0', 'DEFAULT_SPILLABLE_BUFFER_SIZE': '2097152', 'REQUEST_POOL': '', 'IDLE_SESSION_TIMEOUT': '0', 'EXEC_SINGLE_NODE_ROWS_THRESHOLD': '100', 'ENABLE_EXPR_REWRITES': '0'}, hs2_protocol_version=5)
Apr 1, 2020 @ 19:57:56.000 Executing query USE database
Apr 1, 2020 @ 19:57:56.000 _execute_async: self._buffer=Batch() self._description=None self._last_operation_active=False self._last_operation=None
Apr 1, 2020 @ 19:57:56.000 _reset_state: Resetting cursor state
Apr 1, 2020 @ 19:57:56.000 _execute_async: self._buffer=Batch() self._description=None self._last_operation_active=False self._last_operation=None
Apr 1, 2020 @ 19:57:56.000 ExecuteStatement: req=TExecuteStatementReq(confOverlay=None, sessionHandle=TSessionHandle(sessionId=THandleIdentifier(secret='\xaa\n\xfb=y\xdaI\t\x98u\r6>Ss0', guid='.>\t\x12\xff\xd9G\xbb\x80\x01U\x05a\x87\xad\x89')), runAsync=True, statement='USE az2090_username')
Apr 1, 2020 @ 19:57:56.000 Attempting to open transport (tries_left=3)
Apr 1, 2020 @ 19:57:56.000 Transport opened
Apr 1, 2020 @ 18:17:56.000 Connecting to HiveServer2 hostname:25003 with PLAIN authentication mechanism
Apr 1, 2020 @ 18:17:56.000 get_socket: host=hostname port=25003 use_ssl=False ca_cert=None
Apr 1, 2020 @ 18:17:56.000 sock=<thrift.transport.TSocket.TSocket instance at 0x7f054b8c1aa0>
Apr 1, 2020 @ 18:17:56.000 get_transport: socket=<thrift.transport.TSocket.TSocket instance at 0x7f054b8c1aa0> host=hostname kerberos_service_name=impala auth_mechanism=PLAIN user=cpiint_np password=fuggetaboutit
Apr 1, 2020 @ 18:17:56.000 transport=<thrift_sasl.TSaslClientTransport instance at 0x7f054b8c1e60> protocol=<thrift.protocol.TBinaryProtocol.TBinaryProtocolAccelerated instance at 0x7f054b8c8140> service=<impala._thrift_gen.ImpalaService.ImpalaHiveServer2Service.Client object at 0x7f054b8bed50>
Apr 1, 2020 @ 18:17:56.000 HiveServer2Connection(service=<impala.hiveserver2.HS2Service object at 0x7f054b8bed90>, default_db=az2090_username)
Apr 1, 2020 @ 18:17:56.000 Getting a cursor (Impala session)
Apr 1, 2020 @ 18:17:56.000 .cursor(): getting new session_handle
Apr 1, 2020 @ 18:17:56.000 OpenSession: req=TOpenSessionReq(username='root', password=None, client_protocol=5, configuration=None)
Apr 1, 2020 @ 18:17:56.000 Attempting to open transport (tries_left=3)
Apr 1, 2020 @ 18:17:56.000 Transport opened

`

DVAlexHiggs commented 3 years ago

Exactly a year later, still getting this issue. It seems to hang consistently with certain queries (which only return ~200 rows tops) which are near-instant using Database query tools such as DBeaver. Other queries work fine, even when they are more complicated and return more records.

Occurs on the fetchall() call inside as_pandas. Happens even if I use a raw_connection and call fetchall() myself. Left it overnight for 8 hours and it was still hanging in the morning. Any help would be appreciated.

I have auth_mechanism='PLAIN' set, and I have also tried it with create_engine in SQLAlchemy, but to no avail, because internally they all use fetchall(). Using PyHive is not an option because unfortunately I'm in a corporate environment on Windows and cannot install any of the libraries, so I can only use Impyla.

Did you ever fix this @darkz1984 ?

EDIT: I seem to have been able to prevent the hang, but at the cost of performance. If I set the cursor's buffersize to 1:

cursor._buffersize = 1

It generally slows down all queries because of the smaller buffer, but it stops hanging. Bear in mind we are dealing with very small queries here (~200 records returned at most), so I'm not sure how this will affect larger queries.

LeeInJae commented 3 years ago

@DVAlexHiggs Hi Alex, I also faced this issue. When I do fetchall() keep stopping without end or error.

I wonder if you solved this problem. Thanks!