Open M4rque2 opened 1 year ago
This is great! Solves #411 imho
@pudo Do you think this could be merged?
It will also need a bump in setup.py
, no?
Howdy. We are using a library that depends on dataset and would like to see what specifically is preventing this pull request and how we can help git it where it needs to be. Is this pull request adequate or what is holding it back?
Ok, so I've played with this a little bit on #420, and the problem is that this will only work on SQLite, not any DB with proper transactions management. SQLAlchemy has fundamentally gotten rid of autocommit mode, which was underpinning dataset
. While there's probably a way of faking autocommit mode on the library level, I wonder if the cleaner approach wouldn't be to actually make transaction management in dataset more explicit. It gets rid of some of the magicness, but would probably lead people towards using tx'en correctly and therefore creating much faster scripts.
The more I dig into this the less certain I am that we're not due a massive re-work of the library in order to harmonize transaction handling and how it intersects with DDL commands. At the moment, the tests are producing a bunch of deadlocks when running against PostgreSQL.
Does anyone here really understand transactions, and how they relate to schema modification?
@pudo
I wonder if the cleaner approach wouldn't be to actually make transaction management in dataset more explicit. It gets rid of some of the magicness (...)
I think the default should be autocommit, as that's what (IMHO) one would generally expect from a library like dataset. dataset
's advantage is to avoid sqlalchemy's complexity, and I feel making transactions explicit as a default would take away from this. For more transaction control, dataset already provides a respective API, right? https://dataset.readthedocs.io/en/latest/quickstart.html#using-transactions
For the case where full transaction control and speed (and possibly explicit control over DDL) is a driving critiera, I suggest sqlalchemy native is a better choice.
@pudo
The more I dig into this the less certain I am that we're not due a massive re-work of the library in order to harmonize transaction handling and how it intersects with DDL commands.
I added my thoughts , hope it helps (I don't know the library well so I may be wrong). Afaik DDL commands cannot be executed inside another transaction, at least that's what the alembic docs on autocommit_block() seem to imply:
This special directive is intended to support the occasional database DDL or system operation that specifically has to be run outside of any kind of transaction block. The PostgreSQL database platform is the most common target for this style of operation, as many of its DDL operations must be run outside of transaction blocks, even though the database overall supports transactional DDL.
All unit tests passed