gordthompson / sqlalchemy-access

A Microsoft Access dialect for SQLAlchemy.
MIT License
50 stars 9 forks source link

AutoMap support - reflect primary key columns #9

Closed liam-veitch closed 4 years ago

liam-veitch commented 4 years ago

Hi, thanks for the updates to this package!

Not sure if this is related to my specific database, but no primary keys appear to be returned when trying automap_base / metadata.reflect.

Is this a limitation of pyodbc; or is this something which can be supported in the sqlalchemy-access?

Connection is like this :

connection_string = (
        r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
        f'DBQ={db_file}'
    )
connection_url = f"access+pyodbc:///?odbc_connect={urllib.parse.quote_plus(connection_string)}"
    engine = sqlalchemy.create_engine(connection_url)
    metadata = sqlalchemy.MetaData()
    metadata.reflect(engine, only=['My Table'])
    Base = automap_base(metadata=metadata)
gordthompson commented 4 years ago

It is a limitation of the Access ODBC driver. When pyodbc calls the ODBC SQLPrimaryKeys function the driver throws a "Driver does not support this function" error:

https://github.com/gordthompson/sqlalchemy-access/blob/49758166850e33766fe58eb5d27936edb1fd1f6b/sqlalchemy_access/base.py#L724-L727

liam-veitch commented 4 years ago

This could be one option, I have placed code within exception handler but really it could be the main function.

        ```

if ie.args[0] == "IM001":

('IM001', '[IM001] [Microsoft][ODBC Driver Manager] Driver does not support this function (0) (SQLPrimaryKeys)')

            pks = [r[8] for r in pyodbc_crsr.statistics(table_name) if r[5] == 'PrimaryKey']
            if len(pks) > 0:
                return {'constrained_columns': pks, 'name': 'PK'}
            else:
                return []
gordthompson commented 4 years ago

Nice idea, but the primary key index is not always named PrimaryKey. It is if you create the table in the Access UI, but if the table is created with

CREATE TABLE ddl_single (id LONG PRIMARY KEY, txt TEXT(50))

then the primary key index is named something like Index_DEA703EE_F908_4F52.

liam-veitch commented 4 years ago

That is a shame, looking at the documentation for SQLStatistics, I think we could only go as far as to say that an index was unique; but not definitively that it is the Primary Key.

gordthompson commented 4 years ago

Closing for now. Feel free to re-open if you have any new information or want to discuss this further.

gordthompson commented 3 years ago

Version 1.0.7 added warnings for the unsupported ODBC functions.

tonytheodore commented 3 years ago

Nice idea, but the primary key index is not always named PrimaryKey

The important column in statistics, non_unique (also used in get_indexes), identifies all unique keys. One could use some heuristic like:

if more than one unique key:
    use "PrimaryKey" if exists
    use key with least number of columns
    use key with most references (see below)

I'm guessing unique keys with extra columns are generally some sort of covering/optimisation index - then again, single column uniques could be auto-generated surrogates that aren't really part of the logical model. Many tables created via wizard will have such keys.

Who knows? A warning about using an inferred primary key is possibly more helpful - it's a starting point to refine the source database instead of waiting for upstream changes.

Another way to infer an ambiguous primary key would be to check the foreign keys that reference it. I assume people refine their models over time and start using explicit keys beyond the suggested ones without deleting the originals.

I see in the history that system tables can cause problems, but querying MSysRelationships could round out both primary and foreign key support. If it fails, it fails - again, better to start with an educated guess than relying on upstream support. Given how mature ODBC is, one can imagine Microsoft may never add support for these functions.

The other approach (maybe as an additional fallback), is DAO mentioned in the header for jet2sql.py. A mechanical 2to3 conversion and new DAO version generates a reasonably complete schema (data type mapping aside). It will only work on Windows, but I fell back to that since I could only get iODBC working on the Mac, while pyodbc is linked to unixODBC...

gordthompson commented 3 years ago

Hi @tonytheodore . Thanks for the ideas.

In the course of reviving this dialect I considered a few of the suggestions you mention (e.g., using DAO to get at the "real" metadata) but ultimately decided against it. As stated in the README, the main purpose of the dialect is to help people who just want to save a pandas DataFrame into an Access database. .to_sql() requires a SQLAlchemy Engine/Connectable for any database other than SQLite, so the dialect was the missing piece of the puzzle.

If Microsoft decided to fix the deficiencies in the Access ODBC driver then I would consider extending support for things like reflection, thus enabling people to use a tool like sqlacodegen. However, as you said, if it hasn't happened by now then it seems rather unlikely that Microsoft will bother.

tonytheodore commented 3 years ago

Hi @gordthompson, thanks for the quick response.

As stated in the README

Ouch! I skipped the objectives and pre-requisites, the latter would have saved me many hours of frustration over the past few days.

the main purpose of the dialect is to help people who just want to save a pandas DataFrame into an Access database

My end goal is something similar, but to get there I need to do something like:

The first three steps all benefit from solid schema support and this dialect looks like it's 90-95% there. Closer to 99% when I consider my original plan was to write something like jet2sql.py in VBA and add it to every database I want to convert. And that approach wouldn't afford a programmatic schema.

I'll have a play with DAO and see if there's something beyond works-for-me to contribute a PR. I feel there's an optional fallback/overlay to complement the great work you've already done.

gordthompson commented 3 years ago

related: https://stackoverflow.com/a/37713906/2144390

gordthompson commented 3 years ago

Version 1.1.0 (released 2021-08-04) added PK and FK reflection via ACE DAO.