PySimpleSQL / pysimplesql

SQLite3 binding for PySimpleGUI
GNU Lesser General Public License v3.0
47 stars 16 forks source link

MS Access SQLDriver #244

Closed PySimpleSQL closed 1 year ago

PySimpleSQL commented 1 year ago

Just creating this as a separate ticket, as this is one of the most complicated one yet and I want to be sure to document why some decisions have been made.

The issue with Access databases is in supporting multiple Operating Systems. Windows has an ODBC driver specifically for Access databases, but it is a Windows library only.

Linux has as MDBTools library which includes an ODBC driver that can connect to some Access databases, but in my experiments it is very very limited. It can't query the MSys* tables, which is needed to get things like primary key information, column information, etc. A lot of the ODBC standard stuff just doesn't work at all and in my tests it crashed A LOT (though I could successfully read from the database). There are several companies that make excellent ODBC drivers for Linux to interface Access databases, but they are very expensive (mostly around $1000 per year licensing fees). Below are two that I experimented with: https://www.cdata.com/kb/tech/access-odbc-python-linux.rst https://www.easysoft.com/products/data_access/odbc-access-driver/index.html

I've spent a good couple of months trying a lot of different things, and was almost at the point of writing my own ODBC driver. I just happened to run across an excellent Java library called Jackcess that has all of the functionality needed, plus it's cross platform by nature. Jackcess is a very mature and stable Java library for reading and writing to/from MS Access databases. See: https://github.com/jahlborn/jackcess https://sourceforge.net/projects/jackcess/

My plan is to wrap the Java Jackcess library in Python using the jpype Python library. I will wrap only what is needed to meet the definition of our SQLDriver class.

While not the most elegant solution, it seems to be the best solution to provide this functionality across all operating systems at this time.

I have a ton of legacy code/applications that use MS Access and having a stable way to migrate these is pretty high on my priority list!

ssweber commented 1 year ago

Which java version are you using to test?

PySimpleSQL commented 1 year ago

I'm just using open-jdk11

Things are working surprisingly well! No stability issues at all!

On Tue, Apr 4, 2023, 2:29 PM ssweber @.***> wrote:

Which java version are you using to test?

— Reply to this email directly, view it on GitHub https://github.com/PySimpleSQL/pysimplesql/issues/244#issuecomment-1496417192, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQX2REW5LCBWMR52WXMFSY3W7RSAJANCNFSM6AAAAAAWQTKYR4 . You are receiving this because you authored the thread.Message ID: @.***>

ssweber commented 1 year ago

Got it working :)

It doesn’t like my lazy duplicate statement create temporary table as (select * …) so I’ll need to look into that.

also will create a test db to make sure the recursive delete works.

PySimpleSQL commented 1 year ago

Awesome. I just did a pull request that'll fix the missing library. There is still a lot to do on that branch yet. I worked all day on code that can automatically install Java as well

On Wed, Apr 5, 2023, 1:37 PM ssweber @.***> wrote:

Got it working :)

  • added JAVA_HOME to my environmental variables (path/to/Java/bin)
  • the jar files from ucanaccess weren’t uploaded to GitHub, so downloaded 5.0.1 release.

It doesn’t like my lazy duplicate statement create temporary table as (select * …) so I’ll need to look into that.

also will create a test db to make sure the recursive delete works.

— Reply to this email directly, view it on GitHub https://github.com/PySimpleSQL/pysimplesql/issues/244#issuecomment-1497872694, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQX2REW3C2Q56WP6YQXQ36LW7WUUJANCNFSM6AAAAAAWQTKYR4 . You are receiving this because you authored the thread.Message ID: @.***>

Thrameos commented 1 year ago

Hope your use of JPype is going well. If you have any problems be sure to put in an issue request or just tag me in your thread. I think that a number of people just use the anaconda installation to get the Java and Python requirements. Though I would have to defer to others as I don't do much with the distribution and install process.

ssweber commented 1 year ago

I’ve also read about Micromamba as light-weight way of pulling down dependencies.

@Thrameos - that’s some Jedi magic you have there! Do you use a GitHub watcher for JPype mentions?

Thrameos commented 1 year ago

I just run https://github.com/search?o=desc&q=jpype&s=updated&type=Issues in the morning to see if there is anything that looks like unreported issues. Helps to have a very uniquely named project.

PySimpleSQL commented 1 year ago

@Thrameos Thanks for dropping in!

My only struggle with jpype was in casting (especially date, time, datetime, timestamp) - I can't remember off the top of my head, but I was having comparisons with standard python types fail. Is there a way to have the nearest native python types returned?

Jpype is a super cool library, I'm amazed at how well it works and how stable it is!

Thrameos commented 1 year ago

Generally, JPype only has conversions from Python to Java as we leave it to the user to decide what to do with an incoming Java object. This is what allows you to send a Python object to a Java method and have it automatically cast to the correct Java type. For returns we always give Java types, as users may want to convert to a specific Python type (and we can't be sure which type they may want) or they might want to leave it in Java as they are just going to pass it from one Java method to another. If I am wrapping I then set up tables which will convert the type from Java to my specified Python type (more on that later).

The forward conversions are defined in jpype.protocol.py.

@_jcustomizer.JConversion("java.sql.Time", instanceof=datetime.time)
def _toTime(jcls, x):
    return jcls(x.hour, x.minute, x.second)

@_jcustomizer.JConversion("java.sql.Date", instanceof=datetime.date)
def _toDate(jcls, x):
    return jcls(x.year - 1900, x.month - 1, x.day)

@_jcustomizer.JConversion("java.sql.Timestamp", instanceof=datetime.datetime)
def _toTimestamp(jcls, x):
    return jcls(x.year - 1900, x.month - 1, x.day, x.hour, x.minute, x.second, x.microsecond * 1000)

To get them back you would want to add customizers to the Java types to make the more compatible with the Python type. Let me show you an example of this by extending JPype to add a user customization by adding a toDatetime() method.

import jpype
import jpype.imports
import datetime

# We are going to add some user functionality to the Java class 
# to make our lives easier.  This can be executed at any time, even when the 
# JVM is not running.
@jpype.JImplementationFor("java.sql.Timestamp")
class _JTimestamp(object):
    # lets assume that we want java.sql.Timestamp to be able become a Python datetime
    # so we will just add a new method to the Java class at runtime using the customizer.
    def toDatetime(self):
        return datetime.datetime(self.getYear()+1900, self.getMonth()+1, self.getDate(),
                self.getHours(), self.getMinutes(), self. getSeconds(), int(self.getNanos()/1000))

# Lets test it now
jpype.startJVM()
import java
mytime = datetime.datetime(2020, 1, 20, 5, 25, 31, 123)
mytime2 = datetime.datetime(2020, 1, 20, 5, 25, 31, 12)

# Convert a datetime.datetime to a Jave type using the cast operator
jtime = java.sql.Timestamp@mytime
print(type(jtime))
print(type(mytime))
print(mytime==jtime) # gives True because we can promote mytime to Java and check there
print(mytime2==jtime)  # gives False as the promoted type is not equal
print(mytime)
print(jtime.toDatetime())
print(jtime.toDatetime()==mytime) # now lets compare them in Java rather than have them promote to Java

For the types that I already needed for my dbapi2 wrapper you can find the current customizers that I required in the protocol.py.

@_jcustomizer.JImplementationFor('java.sql.Date')
class _JSQLDate:
    def _py(self):
        return datetime.date(self.getYear() + 1900, self.getMonth() + 1, self.getDate())

@_jcustomizer.JImplementationFor('java.sql.Time')
class _JSQLTime:
    def _py(self):
        return datetime.time(self.getHours(), self.getMinutes(), self.getSeconds())

@_jcustomizer.JImplementationFor('java.sql.Timestamp')
class _JDate:
    def _py(self):
        return datetime.datetime(self.getYear() + 1900, self.getMonth() + 1, self.getDate(),
                                 self.getHours(), self.getMinutes(), self.getSeconds(), self.getNanos() // 1000)

So calling _py() on a timestamp will already make a datetime.datetime.

# cast to Java then convert back to Python using the JPype supplied customizer.
print( (java.sql.Timestamp@datetime.datetime(2020, 1, 20, 5, 25, 31, 12))._py())

You will find tables that I used for SQL wrapping in jpype/dbapi2 under the _default_converters section. When returning from a Java call that must return a Python type I just lookup the Java type in the hash table and see if it has a conversion (or if it doesn't we just pass it along).

PS: It usually better to define your own customizers rather than depending on private methods like my _py method, as that way you can customize it to your specifications in terms of types (or if you need timezones and the like).

PySimpleSQL commented 1 year ago

@Thrameos Thanks, that helps a ton! I figured there had to be a better way than I was attempting

ssweber commented 1 year ago

Creating a pull to fix a little issue.

Having an issue with the animated progress bar:

RuntimeError: main thread is not in main loop

PySimpleSQL commented 1 year ago

Yeah, I'm working on that now. I think it's because install-jdk runs in a different thread and tkinter doesn't like it

On Thu, Apr 6, 2023, 11:37 AM ssweber @.***> wrote:

Creating a pull to fix a little issue.

Having an issue with the animated progress bar:

RuntimeError: main thread is not in main loop

— Reply to this email directly, view it on GitHub https://github.com/PySimpleSQL/pysimplesql/issues/244#issuecomment-1499264654, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQX2REUEFLCRMVLYMFF2MO3W73PMNANCNFSM6AAAAAAWQTKYR4 . You are receiving this because you authored the thread.Message ID: @.***>

ssweber commented 1 year ago

Btw, other than the progress bar, the Java installer works correctly now, and everything seems to be functional with the example!

PySimpleSQL commented 1 year ago

Great! I'm honestly surprised how well this all came together, from having Access even working at all (this has to be the only python/Linux database library that works with Access databases, I've looked for a long time), to the new ProgressAnimate, to installing Java automatically :)

Today I'm going to make a separate include file for the Java stuff so it can be reused in multiple examples.

I'm also going to work on the MS Access driver some, and hard-code support for DROP IF EXISTS so that it can be used in a normal way

Could you test something for me? Could you uninstall Java, and use jdk.install with the parameter jre=True and see if the example works? I'm thinking the jre is all that's needed, and is much smaller and faster to download and unzip than the full jdk. It's hard for me to test due to my setup that I use at work.

On Thu, Apr 6, 2023, 9:10 PM ssweber @.***> wrote:

Btw, other than the progress bar, the Java installer works correctly now, and everything seems to be functional with the example!

— Reply to this email directly, view it on GitHub https://github.com/PySimpleSQL/pysimplesql/issues/244#issuecomment-1499812668, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQX2REQKTO4ERFSALMNGQQDW75SQBANCNFSM6AAAAAAWQTKYR4 . You are receiving this because you authored the thread.Message ID: @.***>

ssweber commented 1 year ago

For sure! I’ll let you know. Also need to test the delete cascade… or have you already?

PySimpleSQL commented 1 year ago

I honestly haven't tested much beyond the Journal example, so please test away and let me know what you find out. Hopefully all goes well.

I've also been researching methods to use Windows ODBC drivers in Mac and Linux, so I may mess around with that a little too. It would be nice to reduce the complexity of the current implementation, but at minimum the current driver is usable and seems nice and stable

On Fri, Apr 7, 2023, 7:26 AM ssweber @.***> wrote:

For sure! I’ll let you know. Also need to test the delete cascade… or have you already?

— Reply to this email directly, view it on GitHub https://github.com/PySimpleSQL/pysimplesql/issues/244#issuecomment-1500202117, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQX2RERE7B4MYJPGM3LXV63W772URANCNFSM6AAAAAAWQTKYR4 . You are receiving this because you authored the thread.Message ID: @.***>

ssweber commented 1 year ago

jre=True works fine

PySimpleSQL commented 1 year ago

Great! That should make download and installation pretty quick then

On Fri, Apr 7, 2023, 10:20 AM ssweber @.***> wrote:

jre=True works fine

— Reply to this email directly, view it on GitHub https://github.com/PySimpleSQL/pysimplesql/issues/244#issuecomment-1500333182, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQX2RETUR5Q7HJULSBKTXULXAAPCDANCNFSM6AAAAAAWQTKYR4 . You are receiving this because you authored the thread.Message ID: @.***>

ssweber commented 1 year ago

What do you think about adding the java_install as a msaccess sqldriver parameter? If JAVA_HOME isn’t set?

I plan on improving the demo by using ConfigParser / sg.user_settings_get_entry to save location of previously installed jre, so it doesn’t prompt each time.