pudo / dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
https://dataset.readthedocs.org/
MIT License
4.78k stars 298 forks source link

Looking for help: migrating dataset to sqla 2.0.0 #411

Open pudo opened 1 year ago

pudo commented 1 year ago

Hey all. I've just pushed dataset 1.6.0, which pins the sqlalchemy dependency for this library to >= 1.3.2, < 2.0.0. This is meant as a hotfix to prevent people from getting broken installs by default.

Beyond that, ensuring that dataset will run with both sqlalchemy 1.3, 1.4 and 2.0 will be a bit of work - not horrible, but a solid afternoon of coding. It's forced me to realise that I haven't been running dataset in any of my projects for over two years, and this is also starting to show in some of the developer tooling the lib is using. So I'd like to send out a ping to the community and see if anyone is interested in taking the lead in adapting the library, perhaps even with a long-term view to becoming its new maintainer.

karlicoss commented 1 year ago

I'll see if I can migrate it at least for now

pudo commented 1 year ago

@karlicoss That's great, thank you!

karlicoss commented 1 year ago

So I did some work on it here, but haven't quite managed to finish :( A bit stuck, so leaving in here in case someone wants to pick up/help

https://github.com/pudo/dataset/compare/master...karlicoss:dataset:sqlalchemy-2.0

on sqlalchemy 2.0 I'm getting five tests failing (on sqlite)

FAILED test/test_dataset.py::DatabaseTestCase::test_with - sqlalchemy.exc.InvalidRequestError: This connection has already initialized a SQLAlchemy Transaction() object via begin() or autobegin...
FAILED test/test_dataset.py::TableTestCase::test_table_drop_then_create - sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) table weather has no column named foo
FAILED test/test_dataset.py::TableTestCase::test_update_while_iter - sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Cursor needed to be reset because of commit/rollback and can no longer be fetc...
FAILED test/test_dataset.py::TableTestCase::test_insert_json - AssertionError: 1
FAILED test/test_dataset.py::TableTestCase::test_upsert_many - assert None == 10

For the first three -- not sure what's texactly the issue, for the last two the problems seems to be in reflect_table -- it seems to drop data or something.

It would be cool to merge the changes as is at least, since they are passing for sqlite on sqlachemy 1.4 and I think would at least allow to at least use dataset in read only mode with sqlalchemy 2.0

But I'm getting one test failure for postgres, weird that it's the only one, can't quite figure out why:

__________________ DatabaseTestCase.test_create_table_no_ids ___________________

self = <test.test_dataset.DatabaseTestCase testMethod=test_create_table_no_ids>

    def test_create_table_no_ids(self):
        if "mysql" in self.db.engine.dialect.dbapi.__name__:
            return
        if "sqlite" in self.db.engine.dialect.dbapi.__name__:
            return
        table = self.db.create_table("foo_no_id", primary_id=False)
>       assert table.table.exists()
M4rque2 commented 1 year ago

The 5 errors are all because SQLAlchemy removes auto commit in 2.0.0b1 version, I fix it and all tests passed. However I am not sure there is absolute no any bugs

vladiscripts commented 4 months ago

If all tests are passed, maybe it's time to merge?

pudo commented 4 months ago

I never got most of it to pass, particularly against engines other than SQLite (postgres, mysql). The problem is that the way dataset is doing transactions is fundamentally super loose, and newer versions of SQLAlchemy don't really enable that very well. If you end up with a working branch that runs against SQLite/MySQL and Postgres I'm super happy to merge.