cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0

specify retry times for rpc #82

Open allenlz opened 9 years ago

allenlz commented 9 years ago

The retry count is hard-coded here: https://github.com/cloudera/impyla/blob/v0.10.0/impala/_rpc/hiveserver2.py#L127

Sometimes a user wants zero retries after a failure, because retrying won't help with certain errors, such as exceeding a memory limit. Could you expose the retry count through dbapi?
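To illustrate the request, here is a minimal sketch of a retry wrapper whose count is a parameter rather than a constant. It is not impyla's actual code: `with_retries`, `retries`, and `retriable` are hypothetical names, and the choice of exception types is an assumption.

```python
import functools


def with_retries(retries=3, retriable=(ConnectionError, TimeoutError)):
    """Hypothetical decorator: call once, then retry up to `retries` more times.

    retries=0 disables retrying, so the first failure reaches the caller.
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(retries + 1):
                try:
                    return func(*args, **kwargs)
                except retriable:
                    # Out of attempts: re-raise the last error unchanged.
                    if attempt == retries:
                        raise
        return wrapper
    return decorator
```

With something like this, a caller who knows a retry is pointless (for example after a memory-limit error) could pass `retries=0` and see the failure immediately.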

Thanks.

laserson commented 9 years ago

Sorry for the long delay. Will look into this shortly.

szehon commented 8 years ago

I hit this issue too.

For information, if Impyla is run against Hive and a timeout error occurs, retrying an executeStatement is not useful, because HiveServer2 automatically cleans up a session on disconnect. The retried executeStatement will therefore fail with:

"org.apache.hive.service.cli.HiveSQLException: Invalid SessionHandle: SessionHandle"

In fact, the implicit retry is harmful in this case: it hides the original error, so the application has no indication that the error happened or that it needs to open a new session.
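A rough sketch of the behavior I would prefer: run the statement once and, if the server reports an invalid session handle, raise a distinct error so the application knows to open a new session. Everything here is hypothetical (`SessionLostError`, `execute_once`, and the `execute` callable stand in for whatever the client really uses), and matching on the message text is an assumption based on the exception shown above.

```python
class SessionLostError(Exception):
    """Hypothetical error: the server no longer recognizes the session."""


def execute_once(execute, session, sql):
    """Run `execute(session, sql)` exactly once, with no implicit retry.

    `execute` is a placeholder for the real RPC call. An invalid-session
    failure is re-raised as SessionLostError so the caller can reconnect.
    """
    try:
        return execute(session, sql)
    except Exception as exc:
        # Assumption: the server-side message contains this marker text.
        if "Invalid SessionHandle" in str(exc):
            raise SessionLostError(
                "session was cleaned up by the server; open a new session"
            ) from exc
        raise
```

The application can then catch `SessionLostError`, open a fresh session, and decide for itself whether to resubmit the statement.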