cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0

Setting the impala.dbapi.connect method's "timeout" parameter seems to have no effect #278

Open xunux opened 6 years ago

xunux commented 6 years ago

Problem: I have set the timeout parameter in the connect method, but it seems to have no effect.

My code:

import time

from impala.dbapi import connect

# mlog (a logger), hive_sql, and port are assumed to be defined elsewhere in the script
with connect(host='host', port=port, database='database', auth_mechanism='PLAIN', user='root', timeout=300) as conn:
    with conn.cursor() as cur:
        mlog.info("execute hive sql: %s", hive_sql)
        s_time = time.time()
        cur.execute(hive_sql)  # blocks until the query finishes
        e_time = time.time()
        mlog.info("execute hive sql: %s finished, use time: %s sec", hive_sql, e_time - s_time)

Sometimes the SQL's execution time exceeds 300 seconds:

execute hive sql: select server_id, count(*) from xxx where log_date='xxx' group by server_id finished, use time: 1070.25791502 sec

What is the meaning of the timeout parameter?

boralyl commented 5 years ago

I think the timeout is only for getting a connection. From the code, there is no way to set a timeout on a query. You would have to poll for the status and time out yourself in your own code.
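A rough sketch of that polling approach might look like the following. It assumes the cursor exposes `execute_async()`, `is_executing()`, and `cancel_operation()`, which I believe impyla's HiveServer2 cursor does, but please check against your installed version. `QueryTimeout` and `execute_with_timeout` are hypothetical names made up for this example, not part of impyla.

```python
import time


class QueryTimeout(Exception):
    """Raised when a query runs longer than the caller's deadline."""


def execute_with_timeout(cursor, sql, timeout_s, poll_interval_s=1.0):
    """Start `sql` asynchronously and cancel it if it outlives `timeout_s`.

    Assumption: `cursor` offers execute_async(), is_executing() and
    cancel_operation(); verify these against your impyla version.
    """
    deadline = time.monotonic() + timeout_s
    cursor.execute_async(sql)  # returns without waiting for the query
    while cursor.is_executing():
        if time.monotonic() >= deadline:
            # Ask the server to stop the query, then report the timeout.
            cursor.cancel_operation()
            raise QueryTimeout("query exceeded %s seconds" % timeout_s)
        time.sleep(poll_interval_s)
```

In the original snippet, `cur.execute(hive_sql)` would become `execute_with_timeout(cur, hive_sql, timeout_s=300)`. The deadline is only checked between polls, so the query can overrun by up to one `poll_interval_s` before it is cancelled.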