baztian / jaydebeapi

JayDeBeApi module allows you to connect from Python code to databases using Java JDBC. It provides a Python DB-API v2.0 to that database.
GNU Lesser General Public License v3.0
366 stars 149 forks source link

jaydebeapi working with (1) SQL Server, (2) MySQL - with some type fixes #6

Open RudolfCardinal opened 9 years ago

RudolfCardinal commented 9 years ago

Many thanks for writing this library; very helpful! I have it working with Microsoft SQL Server and MySQL under Linux (Ubuntu). However, it needs additional type converters for BIGINT and BLOB/BINARY types, from what I've tested so far. Full source at https://github.com/RudolfCardinal/pythonlib/blob/master/rnc_db.py . The methods are:

import jaydebeapi

# Reconfigure jaydebeapi

def _rnc_to_binary(rs, col):
    # https://github.com/originell/jpype/issues/71
    # http://stackoverflow.com/questions/5088671
    java_val = rs.getObject(col)
    if java_val is None:
        return
    t = str(type(java_val))
    #logger.info("rnc_to_binary: typeof={}".format(t))
    if t == "<class 'jpype._jarray.byte[]'>":
        return ''.join(map(lambda x: chr(x % 256), java_val))
    logger.warning("Unknown type to _rnc_to_binary: {}".format(t))
    return java_val  # unsure

def reconfigure_jaydebeapi():
    # http://stackoverflow.com/questions/26899595
    from jaydebeapi.dbapi2 import _DEFAULT_CONVERTERS, _java_to_py
    _DEFAULT_CONVERTERS.update({
        'BIGINT': _java_to_py('longValue'),
        # RNC experimental:
        'BINARY': _rnc_to_binary,  # overrides an existing one
        'BLOB': _rnc_to_binary,
        'LONGVARBINARY': _rnc_to_binary,
        'VARBINARY': _rnc_to_binary,
    })

reconfigure_jaydebeapi()

# MySQL prerequisites

MYSQL_JDBC_ERROR_HELP = """

    If you get:
        java.lang.RuntimeException: Class com.mysql.jdbc.Driver not found
    ... then, under Ubuntu/Debian, try:
    (1) sudo apt-get install libmysql-java
    (2) export CLASSPATH=$CLASSPATH:/usr/share/java/mysql.jar

    If you get:
        Failed to connect. OSError: [Errno 2] No such file or directory:
        '/usr/lib/jvm'
    ... under 64-bit Ubuntu, then:
        sudo apt-get install default-jre libc6-i386

"""

# MySQL connection

autocommit = True  # or False
jclassname = "com.mysql.jdbc.Driver"
url = "jdbc:mysql://{host}:{port}/{database}".format(
    host=host, port=port, database=database)
driver_args = [url, user, password]
jars = None
libs = None
db = jaydebeapi.connect(jclassname, driver_args, jars=jars,
                                         libs=libs)
db.jconn.setAutoCommit(autocommit)

# SQL Server prerequisites

SQLSERVER_JDBC_ERROR_HELP = """

    If you get:
        java.lang.RuntimeException: Class
            com.microsoft.sqlserver.jdbc.SQLServerDriver not found
    ... then, under Ubuntu/Debian, try:
    (1) Download the driver from
        http://www.microsoft.com/en-us/download/details.aspx?id=11774
        ... it's sqljdbc_4.1.5605.100_enu.tar.gz
    (2) [sudo] tar xvzf sqljdbc_4.1.5605.100_enu.tar.gz [-C destdir]
    (3) export CLASSPATH=$CLASSPATH:/wherever/sqljdbc_4.1/enu/sqljdbc41.jar

"""

# SQL Server connection

jclassname = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
db_elem = ";databaseName={}".format(database) if database else ""
url = (
    "jdbc:sqlserver://{host}:{port}{db_elem}"
    ";user={user};password={password}".format(
        host=host, port=port, db_elem=db_elem,
        user=user, password=password)
)
driver_args = [url]
jars = None
libs = None
db = jaydebeapi.connect(jclassname, driver_args, jars=jars,
                        libs=libs)
db.jconn.setAutoCommit(autocommit)
baztian commented 9 years ago

Thanks for sharing. It would be nice to have the mapping already provided by jaydebeapi out of the box so you could avoid some plumbing. Your Bigint mapping modification should probably work for all databases. Not sure about the binary stuff though. Especially it seems to be tied to JPype so we'll first find a more general approach that's compatible with Jython as well.

Very interesting to see people using jaydebeapi in their projects even though I don't really get what your project is doing ;). Glad it's helpful for you.

baztian commented 9 years ago

I'm not sure whether your BIGINT conversion works in any case. According to https://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html#longValue%28%29 it might loses some information on very big ints.

One other possibility would be to use BigInteger's toString method and convert to pyhton int

RudolfCardinal commented 9 years ago
  1. I've updated the converters (below), including a ~45-fold speedup for binary data. The logging calls presuppose a Python logger called 'logger', and some imports are also presupposed.
  2. When accessing databases having VARBINARY(MAX) fields, via the Microsoft SQL Server JDBC Driver 4.1 (via jpype via jaydebeapi), data-fetching functions crash when they exit (as the database cursor they own goes out of scope and is deleted). Internally, as determined by attaching the gdb debugger, the crash is at a recv() call, itself called by Java.java.net.SocketInputStream.socketRead0(). This one took me a while to understand and work around! This crash can be averted by passing the "responseBuffering=adaptive" setting when opening the JDBC connection, e.g. via the connection URL; see https://msdn.microsoft.com/en-us/library/ms378988(SQL.90).aspx .

The project is an anonymisation tool, by the way: https://github.com/RudolfCardinal/anonymise

Converters:

def _convert_java_binary(rs, col):
    # https://github.com/originell/jpype/issues/71
    # http://stackoverflow.com/questions/5088671
    # https://github.com/baztian/jaydebeapi/blob/master/jaydebeapi/__init__.py
    # https://msdn.microsoft.com/en-us/library/ms378813(v=sql.110).aspx
    # http://stackoverflow.com/questions/2920364/checking-for-a-null-int-value-from-a-java-resultset  # noqa

    v = None
    logger.debug("_convert_java_binary: converting...")
    time1 = time.time()
    try:
        # ---------------------------------------------------------------------
        # Method 1: 3578880 bytes in 21.7430660725 seconds =   165 kB/s
        # ---------------------------------------------------------------------
        # java_val = rs.getObject(col)
        # if java_val is None:
        #     return
        # t = str(type(java_val))
        # if t == "<class 'jpype._jarray.byte[]'>": ...
        # v = ''.join(map(lambda x: chr(x % 256), java_val))

        # ---------------------------------------------------------------------
        # Method 2: 3578880 bytes in 8.07930088043 seconds =   442 kB/s
        # ---------------------------------------------------------------------
        # java_val = rs.getObject(col)
        # if java_val is None:
        #     return
        # l = len(java_val)
        # v = bytearray(l)
        # for i in xrange(l):
        #     v[i] = java_val[i] % 256

        # ---------------------------------------------------------------------
        # Method 3: 3578880 bytes in 20.1435189247 seconds =   177 kB/s
        # ---------------------------------------------------------------------
        # java_val = rs.getObject(col)
        # if java_val is None:
        #     return
        # v = bytearray(map(lambda x: x % 256, java_val))

        # ---------------------------------------------------------------------
        # Method 4: 3578880 bytes in 0.48352599144 seconds = 7,402 kB/s
        # ---------------------------------------------------------------------
        j_hexstr = rs.getString(col)
        if rs.wasNull():
            return
        v = binascii.unhexlify(j_hexstr)

    finally:
        time2 = time.time()
        logger.debug("... done (in {} seconds)".format(time2 - time1))
        return v

def _convert_java_bigstring(rs, col):
    v = str(rs.getCharacterStream(col))
    if rs.wasNull():
        return None
    return v

def _convert_java_bigint(rs, col):
    # http://stackoverflow.com/questions/26899595
    # https://github.com/baztian/jaydebeapi/issues/6
    # https://github.com/baztian/jaydebeapi/blob/master/jaydebeapi/__init__.py
    # https://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html
    # http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html
    java_val = rs.getObject(col)
    if java_val is None:
        return
    v = getattr(java_val, 'toString')()  # Java call: java_val.toString()
    return int(v)

def reconfigure_jaydebeapi():
    from jaydebeapi.dbapi2 import _DEFAULT_CONVERTERS
    _DEFAULT_CONVERTERS.update({
        'BIGINT': _convert_java_bigint,

        'BINARY': _convert_java_binary,  # overrides an existing one
        'BLOB': _convert_java_binary,
        'LONGVARBINARY': _convert_java_binary,
        'VARBINARY': _convert_java_binary,

        'LONGVARCHAR': _convert_java_bigstring,
        'LONGNVARCHAR': _convert_java_bigstring,
    })

reconfigure_jaydebeapi()
baztian commented 9 years ago

Thank's a lot. I'll try to have a closer look very soon and try if I can integrate (and unit test) that. Hopefully this will be inter db compatible... One thing: I guess for Python < 3 you should use long(v) for bigint conversion.

RudolfCardinal commented 9 years ago

Thanks. Possibly, but only for antique versions: https://www.python.org/dev/peps/pep-0237/

baztian commented 9 years ago

@RudolfCardinal, can you please check if the changes made in https://github.com/baztian/jaydebeapi/tree/feature/advanced-datatypes are working for you? If that's possible, please report back against which databases and newly supported datatypes you've tested and if you're using Python or Jython? I didn't find the time yet to implement some more decent tests.

BTW: If you fancy I would be really grateful if you can contribute something to https://github.com/baztian/dbapi-compliance/ . I've started a a jaydebeapi sepcific integration test using postgresql in https://github.com/baztian/dbapi-compliance/tree/jaydebeapi_postgres.

PS: Will be very busy the next two weeks and might not respond in that time.

baztian commented 9 years ago

@RudolfCardinal I've tried the LONGVARCHAR mapping on Oracle. I get a column string value like this one: 'oracle.jdbc.driver.OracleConversionReader@64bfbc86' When I change to _to_bigstr to

def _to_bigstr(rs, col):
    return rs.getString(col)

I get the string value. Also this method would work to map a CLOB column.

Did you find the time to try the branch I've mentioned?

RudolfCardinal commented 9 years ago

Dear Bastian,

OK - I'm afraid I don't have access to Oracle (I've been using MySQL and SQL Server).

I'm not sure how to create test suites for new engines (following git clone https://github.com/baztian/dbapi-compliance and git checkout jaydebeapi_postgres, I see that one should run test_jaydebeapi_postgres_dbapi20.py). Is the idea to create a temporary table, stuff it with content via raw SQL and/or the interface, and check what comes back out? I guess a method/engine test grid is required, since there seem to be a few methods that might work. Is the Oracle difference because of something nonstandard about the various JDBC interfaces?

kervel commented 8 years ago

Hello,

for SAS IOM JDBC driver, getInt() or getObject() on integers will truncate long integers. I changed the convertor to getLong() for type INTEGER to make things work.

greetings, Frank

sunilag commented 6 years ago

Hello,

When i'm using with MSSQL , returns me this error for query result which has date time stamp <jpype._jclass.microsoft.sql.DateTimeOffset object at 0x7f7eb26e4ad0>

The query result has TimeStamp with value "2017-09-21 12:23:29.000 -05:00"

Could you please help /advice.