jpype-project / jpype

JPype is a cross-language bridge that allows Python programs full access to Java class libraries.
http://www.jpype.org
Apache License 2.0

Apache Drill JDBC driver only loaded on second attempt #913

Open xhochy opened 3 years ago

xhochy commented 3 years ago

Connecting with the Apache Drill JDBC driver only ever works on the second attempt. The first attempt always fails with the same missing Java class.

Reproducible example

# wget http://archive.apache.org/dist/drill/drill-1.18.0/apache-drill-1.18.0.tar.gz
# tar xf apache-drill-1.18.0.tar.gz

import os

import jpype
import jpype.dbapi2

classpath = os.path.join(os.getcwd(), "apache-drill-1.18.0/jars/jdbc-driver/drill-jdbc-all-1.18.0.jar")
jpype.startJVM(classpath=classpath)

jpype.dbapi2.connect("jdbc:drill:drillbit=127.0.0.1")

This errors with the following trace for me. A second call to jpype.dbapi2.connect(…) succeeds, though.

java.lang.ClassNotFoundException          Traceback (most recent call last)
…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in java.lang.ClassLoader.loadClass()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in sun.misc.Launcher$AppClassLoader.loadClass()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in java.lang.ClassLoader.loadClass()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in java.net.URLClassLoader.findClass()

java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: oadd.org.apache.drill.exec.store.StoragePluginRegistry

The above exception was the direct cause of the following exception:

Exception                                 Traceback (most recent call last)
…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in org.jpype.JPypeContext.newWrapper()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in org.jpype.manager.TypeFactoryNative.newWrapper()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in org.jpype.manager.TypeManager.populateMembers()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in org.jpype.manager.TypeManager.createMembers()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in org.jpype.manager.TypeManager.createMethodDispatches()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in java.lang.Class.getDeclaredMethods()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in java.lang.Class.privateGetDeclaredMethods()

…/env/lib/python3.8/site-packages/_jpype.cpython-38-darwin.so in java.lang.Class.getDeclaredMethods0()

Exception: Java Exception

The above exception was the direct cause of the following exception:

java.lang.NoClassDefFoundError            Traceback (most recent call last)
<ipython-input-1-c05c50c246ba> in <module>
     10 jpype.startJVM(classpath=classpath)
     11 
---> 12 jpype.dbapi2.connect("jdbc:drill:drillbit=127.0.0.1")

…/env/lib/python3.8/site-packages/jpype/dbapi2.py in connect(dsn, driver, driver_args, adapters, converters, getters, setters, **kwargs)
    418     # User supplied nothing
    419     elif driver_args is None:
--> 420         connection = DM.getConnection(dsn)
    421 
    422     # Otherwise use the kwargs

java.lang.NoClassDefFoundError: java.lang.NoClassDefFoundError: oadd/org/apache/drill/exec/store/StoragePluginRegistry
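Since a second call succeeds, a stopgap is to retry the connection once. The following is only a minimal sketch of that idea, not a fix for the underlying problem: connect_with_retry and its parameters are hypothetical names, and the connect callable is passed in so the helper does not depend on JPype itself.

```python
def connect_with_retry(connect, dsn, attempts=2, retry_on=(Exception,)):
    """Call connect(dsn) up to `attempts` times and re-raise the last failure.

    `connect` is any callable taking a DSN string, for example
    jpype.dbapi2.connect (assumed to be imported by the caller).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect(dsn)
        except retry_on:
            # Give up only once the final attempt has failed.
            if attempt == attempts:
                raise

# Hypothetical usage with the example above:
#   connect_with_retry(jpype.dbapi2.connect, "jdbc:drill:drillbit=127.0.0.1")
```

Retrying hides the first failure rather than explaining it, so this is only useful until the root cause is found.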

Thrameos commented 3 years ago

It appears to be an issue with reflection. JPype is in the process of loading a requested class when Java itself produced an error.

  File "ClassLoader.java", line 522, in java.lang.ClassLoader.loadClass
  File "ClassLoaders.java", line 178, in jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass
  File "BuiltinClassLoader.java", line 581, in jdk.internal.loader.BuiltinClassLoader.loadClass
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: oadd.org.apache.drill.exec.store.StoragePluginRegistry

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "org.jpype.JPypeContext.java", line -1, in org.jpype.JPypeContext.newWrapper
  File "org.jpype.manager.TypeFactoryNative.java", line -2, in org.jpype.manager.TypeFactoryNative.newWrapper
  File "org.jpype.manager.TypeManager.java", line -1, in org.jpype.manager.TypeManager.populateMembers
  File "org.jpype.manager.TypeManager.java", line -1, in org.jpype.manager.TypeManager.createMembers
  File "org.jpype.manager.TypeManager.java", line -1, in org.jpype.manager.TypeManager.createMethodDispatches  <=== call to getDeclaredMethods
  File "Class.java", line 2309, in java.lang.Class.getDeclaredMethods
  File "Class.java", line 3166, in java.lang.Class.privateGetDeclaredMethods
  File "Class.java", line -2, in java.lang.Class.getDeclaredMethods0
Exception: Java Exception

I looked over the Apache Drill jar that is being loaded and I do not see the requested class in the manifest. So apparently the drill-jdbc-all jar refers to a class which cannot be found.
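A check like this can be scripted, since a jar is a zip archive. The sketch below looks for the class entry in the archive listing; jar_contains_class is a hypothetical helper name, and it assumes the class would be stored under its usual path-style entry name.

```python
import zipfile

def jar_contains_class(jar_path, class_name):
    """Return True if the jar has a .class entry for the dotted class name."""
    # A class a.b.C is normally stored in the archive as a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

# Hypothetical usage against the jar from the report:
#   jar_contains_class(
#       "apache-drill-1.18.0/jars/jdbc-driver/drill-jdbc-all-1.18.0.jar",
#       "oadd.org.apache.drill.exec.store.StoragePluginRegistry",
#   )
```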

Looking over the contents of the jar, it appears to use Javassist, so this is likely a case of dynamically created classes. Most likely they intend for the classes to be loaded with a custom classloader which will dynamically generate the required class.

I added a diagnostic line to see which class is failing.

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by oadd.javassist.util.proxy.SecurityActions (file:/mnt/c/Users/nelson85/Documents/devel/open/jpype2/apache-drill-1.18.0/jars/jdbc-driver/drill-jdbc-all-1.18.0.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of oadd.javassist.util.proxy.SecurityActions
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Create dispatchesclass org.apache.drill.jdbc.impl.DrillConnectionImpl <=== failure point
Create dispatchesclass java.lang.NoClassDefFoundError
Create dispatchesclass java.lang.LinkageError
Create dispatchesclass java.lang.Error
Create dispatchesclass java.lang.ClassNotFoundException
Create dispatchesclass java.lang.ReflectiveOperationException
Traceback (most recent call last):
  File "ClassLoader.java", line 522, in java.lang.ClassLoader.loadClass
  File "ClassLoaders.java", line 178, in jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass
  File "BuiltinClassLoader.java", line 581, in jdk.internal.loader.BuiltinClassLoader.loadClass
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: oadd.org.apache.drill.exec.store.StoragePluginRegistry

So they are doing some very bad things here. They have opened the classloader via reflection and requested that it define a class (rather than delegating to a different custom classloader).

Unfortunately this is not really a JPype bug, in that JPype made normal reflection requests and something within Drill failed to function.

To demonstrate this, the following code replicates the problem without the use of JPype.

public class Test
{
        public static void main(String[] args) throws ClassNotFoundException
        {
                Class<?> cls = Class.forName("org.apache.drill.jdbc.impl.DrillConnectionImpl");
                // getDeclaredMethods() is the reflection call that fails in the trace above.
                System.out.println(cls.getDeclaredMethods());
        }
}

Compile and then execute it with...

java -cp apache-drill-1.18.0/jars/jdbc-driver/drill-jdbc-all-1.18.0.jar:. Test

I recommend bouncing this one upstream to Apache Drill. I am not going to be able to do much if the jar is sufficiently broken that basic reflection is failing. Either this is a dynamic class that needed to be created in advance, or there is a dependency missing from the jar.

Thrameos commented 3 years ago

I am also afraid that changes related to #928 are likely to exacerbate this problem. Hopefully there is still a way to get this driver loaded after that bug fix.

vvysotskyi commented 3 years ago

The issue is in the Drill JDBC driver, but it is not related to Javassist. DrillConnectionImpl from the driver should not use the StoragePluginRegistry class; I'll fix it soon.

@xhochy, I've tried running the code you have shared with the custom version of the driver, but it fails with the following exception:

Traceback (most recent call last):
  File "DrillConnectionImpl.java", line 222, in org.apache.drill.jdbc.impl.DrillConnectionImpl.setAutoCommit
Exception: Java Exception

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/vova/PycharmProjects/jpype-drill/main.py", line 15, in <module>
    jpype.dbapi2.connect("jdbc:drill:drillbit=127.0.0.1")
  File "/Users/vova/PycharmProjects/jpype-drill/venv/lib/python3.8/site-packages/jpype/dbapi2.py", line 429, in connect
    return Connection(connection, adapters, converters, setters, getters)
  File "/Users/vova/PycharmProjects/jpype-drill/venv/lib/python3.8/site-packages/jpype/dbapi2.py", line 455, in __init__
    self._jcx.setAutoCommit(False)
java.sql.SQLFeatureNotSupportedException: java.sql.SQLFeatureNotSupportedException: Can't turn off auto-committing; transactions are not supported.  (Drill is not transactional.)

Drill doesn't support transactions, so setting the autoCommit value to false throws an exception. Does the Python script you attached need any additional changes to avoid this issue, so that I can check the driver's functionality? If I understand correctly, according to the Python Database API Specification v2.0 this method is expected not to do anything.
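One possible shape for tolerating this, purely as a sketch and not what jpype.dbapi2 currently does: attempt to turn auto-commit off and treat "not supported" as a signal that the connection is non-transactional. The helper name and its not_supported parameter are hypothetical; with JPype the exception type passed in would presumably be java.sql.SQLFeatureNotSupportedException, as seen in the trace.

```python
def disable_autocommit_if_supported(jcx, not_supported):
    """Try jcx.setAutoCommit(False); return False if the driver refuses.

    `jcx` is any object exposing a JDBC-style setAutoCommit method and
    `not_supported` is the exception type (or tuple of types) that the
    driver raises when transactions are unavailable. Both are supplied
    by the caller, so this sketch does not import JPype itself.
    """
    try:
        jcx.setAutoCommit(False)
        return True
    except not_supported:
        # Non-transactional driver: leave auto-commit on and carry on.
        return False
```

The return value lets the caller record whether commit and rollback are meaningful for this connection.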

Thrameos commented 3 years ago

Thanks so much for taking a look at this. Javassist was just a guess: I couldn't find the plugin registry anywhere in the distribution, so my best guess was a generated class.

If you would like to do further tests with JPype, I would recommend looking over the test benches we have for various databases under test/jpypetest/test_sql*. Currently we test on h2, mysql, and hypersql simply because those had instructions on setting up "in memory" SQL support. I am not a regular database user, but I did try to follow the Python Database specification. There are a lot of ambiguities, so I am not sure how well it meets the standard. The dbapi2 standard is really weak at defining which exceptions are thrown, so drivers are very inconsistent on that front.

If you want to modify the h2 or hypersql tests to apply to Drill, we can add it to our supported drivers. The procedure is simple: add the required jars to the ivy.xml specification and call resolve.sh; copy an existing test bench to test/jpypetest/test_sql_drill.py and set the configuration string for an in-memory database; then modify the tests for Drill so that its capabilities (and lack of capabilities) are tested. The test bench is run using python -m pytest test/jpypetest/test_sql_drill.py. It is not necessary to get all the tests to pass. Just make sure to test for the correct behavior and submit a PR, as I can deal with behavioral fixes in jpype.dbapi2 to get the tests to pass.

vvysotskyi commented 3 years ago

Hi @Thrameos, thanks for pointing me to the existing tests and for the insights on how to create and run tests. Unfortunately, Drill cannot be run in-memory using only the JDBC driver. I've tried to run the tests locally, but they fail because the required driver cannot be found (the same happens for the existing tests), even though resolve.sh was executed and the required jars are present in the lib folder.

Anyway, I've made changes in Drill, so now it is possible to connect to it through the JPype, submit queries, and obtain results.

cgivre commented 3 years ago

@Thrameos Would you be able to take a look at https://github.com/apache/drill/pull/2158?

Thrameos commented 3 years ago

Sure, I should be able to look it over this week.

Thrameos commented 3 years ago

@vvysotskyi Hmm, it should run if executed from the top-level directory. If run from other locations, you would have to give a classpath argument to pytest.

~/devel/open/jpype-release$ python3 -m pytest test/jpypetest/test_sql_h2.py
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.6.9, pytest-4.6.9, py-1.9.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /mnt/c/Users/nelson85/Documents/devel/open/jpype-release, inifile: setup.cfg
collected 90 items

test/jpypetest/test_sql_h2.py::ConnectTestCase::testAdapters PASSED                                                                                                  [  1%]
test/jpypetest/test_sql_h2.py::ConnectTestCase::testClose PASSED                                                                                                     [  2%]
test/jpypetest/test_sql_h2.py::ConnectTestCase::testConnect PASSED                                                                                                   [  3%]
test/jpypetest/test_sql_h2.py::ConnectTestCase::testConverters PASSED
...