dropbox / PyHive

Python interface to Hive and Presto. 🐝

PyHive and Transport mode - HTTP #69

Open gregorysuarez opened 8 years ago

gregorysuarez commented 8 years ago

Our server is configured with hive.server2.transport.mode set to HTTP. When switching to binary, everything seems to work perfectly. Is there a way to enable PyHive to work with HTTP transport mode?

jingw commented 8 years ago

Sorry, there is no code for HTTP mode.

zscalerspark commented 7 years ago

Any timeline/priority for when this will be available?

jrevillard commented 6 years ago

I'm very interested in this functionality too, as I must start using HTTP mode for Apache Knox. Is it something that can be planned?

jbreija commented 6 years ago

Me too; I must use Apache Knox to authenticate with Hive. Is this currently possible with PyHive?

Polar-is commented 6 years ago

Likewise, I'd also be interested in this functionality!

rohit-menon commented 5 years ago

Any updates on HTTP support?

Neelotpaul commented 5 years ago

Please add HTTP support. I am trying to use PyHive in a production environment where other teams also connect through JDBC and PySpark, so I cannot switch the server to binary mode if that would break their connections.

rashmigulhane commented 5 years ago

+1. Can you please add HTTP support for PyHive?

modeyang commented 4 years ago

+1

joaopedroantonio commented 4 years ago

Hi all, I just created a PR to add support for Thrift connections over HTTP transport. You can follow its progress here: https://github.com/dropbox/PyHive/pull/325

pauldevos commented 4 years ago

> Our server is configured with hive.server2.transport.mode set to HTTP. When switching to binary, everything seems to work perfectly. Is there a way to enable PyHive to work with HTTP transport mode?

So as an end user, rather than someone configuring the Hive datastore, this is likely not possible via my connection string alone.

e.g.

con = hive.Connection(host=hive_host, port=10000, username=hive_username, auth='NOSASL')

Docstring for hive.Connection:

Init signature:
hive.Connection(
    host=None,
    port=None,
    username=None,
    database='default',
    auth=None,
    configuration=None,
    kerberos_service_name=None,
    password=None,
    thrift_transport=None,
)
Docstring:      Wraps a Thrift session
Init docstring:
Connect to HiveServer2

:param host: What host HiveServer2 runs on
:param port: What port HiveServer2 runs on. Defaults to 10000.
:param auth: The value of hive.server2.authentication used by HiveServer2.
    Defaults to ``NONE``.
:param configuration: A dictionary of Hive settings (functionally same as the `set` command)
:param kerberos_service_name: Use with auth='KERBEROS' only
:param password: Use with auth='LDAP' or auth='CUSTOM' only
:param thrift_transport: A ``TTransportBase`` for custom advanced usage.
    Incompatible with host, port, auth, kerberos_service_name, and password.

The way to support LDAP and GSSAPI is originated from cloudera/Impyla:
https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62
/impala/_thrift_api.py#L152-L160
File:           ~/miniconda3/envs/spark_2_4_4/lib/python3.8/site-packages/pyhive/hive.py
Type:           type
Subclasses:   
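
Until HTTP mode is built in, the `thrift_transport` parameter documented above is the usual escape hatch: build the HTTP transport yourself with Apache Thrift's `THttpClient` and hand it to `Connection`. A minimal sketch, assuming a HiveServer2 HTTP endpoint at `http://hive.example.com:10001/cliservice` with Basic auth; the URL, path, and credentials are placeholders, and behind Apache Knox you would point it at the gateway URL (typically over HTTPS) instead:

```python
import base64

from pyhive import hive
from thrift.transport import THttpClient

# Placeholder endpoint; with Knox this is usually something like
# https://knox-host:8443/gateway/default/hive instead.
url = "http://hive.example.com:10001/cliservice"
transport = THttpClient.THttpClient(url)

# Basic auth header for LDAP/Knox-style authentication (placeholder credentials).
auth = base64.b64encode(b"hive_username:hive_password").decode("ascii")
transport.setCustomHeaders({"Authorization": "Basic " + auth})

# thrift_transport is incompatible with host/port/auth/password, per the docstring above.
conn = hive.Connection(thrift_transport=transport)

cursor = conn.cursor()
cursor.execute("SELECT 1")
print(cursor.fetchall())
```

This only sketches the transport wiring; whether it works end to end depends on how your endpoint authenticates and whether it requires TLS. The PR linked above would make the workaround unnecessary.
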
RamakrishnaChilaka commented 3 years ago

@pauldevos, thanks a lot for the above snippet. Can we log all the HTTP headers in the Spark Thrift Server?

danjampro commented 1 year ago

Hi, are there any updates on this?