baztian / jaydebeapi

JayDeBeApi module allows you to connect from Python code to databases using Java JDBC. It provides a Python DB-API v2.0 to that database.
GNU Lesser General Public License v3.0

Allow additional JAVA_OPTS #19

Open jethrow opened 8 years ago

jethrow commented 8 years ago

I have confirmed jaydebeapi works with Hive. However, due to Kerberos authentication, I had to start the JVM separately to provide additional options. I suggest adding a parameter to _jdbc_connect_jpype/connect that accepts a list of JAVA_OPTS.

jtbirdsell commented 8 years ago

I'm having a similar issue. What options did you have to use to make this possible?

jethrow commented 8 years ago

I created a startJVM function based on the _jdbc_connect_jpype function and call it in my script:

import os

def startJVM(jarList=None, optList=None):
    # Based on _jdbc_connect_jpype in
    # https://github.com/baztian/jaydebeapi/blob/master/jaydebeapi/__init__.py
    import jpype
    args = []
    class_path = []
    if jarList:
        class_path.extend(jarList)
    # Append the existing CLASSPATH only if it is set (avoids a KeyError).
    if os.environ.get("CLASSPATH"):
        class_path.append(os.environ["CLASSPATH"])
    if class_path:
        args.append('-Djava.class.path=' + os.path.pathsep.join(class_path))
    # if libs:
    #     # path to shared libraries
    #     libs_path = os.path.pathsep.join(libs)
    #     args.append('-Djava.library.path=%s' % libs_path)
    if optList:
        # Extra JVM options, e.g. -Dkey=value system properties.
        args.extend(optList)

    jvm_path = jpype.getDefaultJVMPath()
    jpype.startJVM(jvm_path, *args)

# jars, realm and kdc are defined elsewhere in the script.
startJVM(jars, ["-Djava.security.krb5.realm="+realm, "-Djava.security.krb5.kdc="+kdc])

baztian commented 7 years ago

Another (workaround) option is to call jpype.startJVM before calling `jaydebeapi.connect()`. But you are right: there should be a parameter for this, so I'll leave this issue open.
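
A minimal sketch of that workaround, for anyone who wants to guard against starting the JVM twice. The helper names build_jvm_args and start_jvm_once are hypothetical and not part of jaydebeapi; the only JPype calls used are isJVMStarted, getDefaultJVMPath and startJVM. It assumes the extra options must be in place before the first jaydebeapi.connect() call in the process.

```python
import os


def build_jvm_args(jars=None, java_opts=None, env=None):
    """Assemble JVM arguments (hypothetical helper, not a jaydebeapi API)."""
    env = os.environ if env is None else env
    class_path = list(jars or [])
    # Keep whatever CLASSPATH the environment already provides, if any.
    if env.get("CLASSPATH"):
        class_path.append(env["CLASSPATH"])
    args = []
    if class_path:
        args.append("-Djava.class.path=" + os.pathsep.join(class_path))
    # Extra options such as -Djava.security.krb5.realm=... go last.
    args.extend(java_opts or [])
    return args


def start_jvm_once(jars=None, java_opts=None):
    """Start the JVM with extra options unless one is already running."""
    import jpype  # imported lazily so build_jvm_args works without JPype

    if jpype.isJVMStarted():
        # A running JVM cannot pick up new startup options.
        return False
    jpype.startJVM(jpype.getDefaultJVMPath(), *build_jvm_args(jars, java_opts))
    return True
```

Call start_jvm_once(...) once at the top of the script, then call jaydebeapi.connect() as usual.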

jbraun11 commented 7 years ago

I tried adding the krb5 properties to the jpype JVM but I still get the error Could not open client transport with JDBC Uri: jdbc:hive2://host:port/db;principal=: GSS initiate failed. Any ideas on the issue?

import jaydebeapi
import jpype

jvmPath = jpype.getDefaultJVMPath()
jpype.startJVM(jvmPath, "-Djava.ext.dirs=/usr/hdp/2.4.3.0-227/hive/lib:/usr/hdp/2.4.3.0-227/hadoop/client", "-Djava.security.krb5.realm=<realm>", "-Djava.security.krb5.kdc=<kdc>")

conn = jaydebeapi.connect("org.apache.hive.jdbc.HiveDriver",
                          "jdbc:hive2://host:port/db;principal=<principal>",
                          [], "",)

whummer commented 7 years ago

+1. We need to pass in logging configuration as a JVM system property, so it would be great to have a config parameter for it.

baztian commented 7 years ago

Allowing additional JAVA_OPTS would only work for JPype and not for Jython. Supplying system properties (-Dkey=value) could work for both by using the System.setProperty method. I'm thinking of an additional dictionary parameter to the connect method. What do you think?
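
To illustrate the idea, here is a rough sketch of how such a dictionary could be applied under JPype. The function apply_system_properties and its set_property argument are hypothetical and do not exist in jaydebeapi; by default it resolves java.lang.System through jpype.JClass, which requires an already running JVM. Properties the JVM reads only at startup, such as java.class.path, would not take effect this way.

```python
def apply_system_properties(props, set_property=None):
    """Set Java system properties from a dict (hypothetical helper).

    set_property is injectable so the logic can be exercised without a JVM;
    by default it uses java.lang.System.setProperty via JPype.
    """
    if set_property is None:
        import jpype  # needs a JVM that has already been started

        set_property = jpype.JClass("java.lang.System").setProperty
    for key, value in props.items():
        # System.setProperty takes two strings, so convert defensively.
        set_property(str(key), str(value))
```

Under Jython the same loop could call java.lang.System.setProperty directly, which is why a dictionary is more portable than raw JVM options.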

huguetpj commented 6 years ago

Hi. In case anybody else lands on this page through Google like I did, this is how I made it work in Python 2.7, using Hadoop with Kerberos and HA. This code works after the user has run kinit.

import jaydebeapi
import jpype

# initialize connection params
driverclass = "org.apache.hive.jdbc.HiveDriver"
url = ""
params = {}
jar = ""

print("setting up for Linux")
# `user` (the account to impersonate) must be defined before this point
url = ("jdbc:hive2://M1.DEV.local:PPP,M2.DEV.local:PPP,M3.DEV.local:PPP/;"
       "serviceDiscoveryMode=zooKeeper;"
       "zooKeeperNamespace=hiveserver2;"
       "transportMode=http;"
       "httpPath=cliservice;"
       "principal=hive/_HOST@DEV.LOCAL;"
       "hive.server2.proxy.user=" + user + ";"
       )
jar = "/path/to/hive-jdbc-1.2.1000.2.6.2.0-205-standalone.jar"

# this is needed to work with Kerberos impersonation:
# the JVM must be started with useSubjectCredsOnly=false before calling jaydebeapi.connect
args = '-Djava.class.path=%s' % jar
jvm_path = jpype.getDefaultJVMPath()
jpype.startJVM(jvm_path, args, '-Djavax.security.auth.useSubjectCredsOnly=false')

conn = jaydebeapi.connect(driverclass,
                          url,
                          params,
                          jar, )

antonioshadji commented 5 years ago

Worked for me with the same URL used by the beeline CLI, on Python 3.5.

FeatCrush commented 5 years ago

I worked on the connect() method with JPype to facilitate authentication with Kerberos.
