sqlalchemy / alembic

A database migrations tool for SQLAlchemy.
MIT License
2.86k stars 248 forks source link

Running `alter table` revisions w/batch operations on SQLite raises an error if table referenced in a view #1207

Open jdavcs opened 1 year ago

jdavcs commented 1 year ago

Describe the bug This may be not so much an Alembic bug as a limitation of SQLite; nevertheless, the following use case results in an error that is not mentioned in the documentation; besides, there seems to be a relatively simple fix for this.

Consider the following scenario:

  1. create table foo
  2. create view myfoo
  3. add a revision that adds a column from foo
  4. run upgrade, then downgrade. The downgrade will raise an error.

The specific schema of foo, as well as the column we are adding/dropping are irrelevant. But here's an example:

########################## REVISION 1 ########################## 
def upgrade() -> None:
    op.create_table("foo", sa.Column("id", sa.Integer, primary_key=True))   

def downgrade() -> None:
    op.drop_table("foo")

########################## REVISION 2 ##########################
def upgrade() -> None:                                                                                                                                                                        
    op.execute("CREATE VIEW my_foo AS SELECT id FROM foo")                                                                                                                                                                                                                                                                                                                                

def downgrade() -> None:                                                                                                                                                                      
    op.execute("DROP VIEW my_foo")

########################## REVISION 3 ##########################
def upgrade() -> None:
    with op.batch_alter_table("foo") as batch_op:
        batch_op.add_column(sa.Column("stuff", sa.String))

def downgrade() -> None:
    with op.batch_alter_table("foo") as batch_op:
         batch_op.drop_column("stuff")

# (I can provide the full revision files if needed)

Running migrations results in the following:

$ alembic upgrade heads
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 0a35dd0c400b, create table foo
INFO  [alembic.runtime.migration] Running upgrade 0a35dd0c400b -> 70ec189d8dea, create view myfoo
INFO  [alembic.runtime.migration] Running upgrade 70ec189d8dea -> c907a34d4a26, add column to table foo

$ alembic downgrade -1
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running downgrade c907a34d4a26 -> 70ec189d8dea, add column to table foo
Traceback (most recent call last):
  File "/home/jdavcs/0dev/drafts/python/sqlalchemy/views/v2/.venv/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1964, in _exec_single_context
    self.dialect.do_execute(
  File "/home/jdavcs/0dev/drafts/python/sqlalchemy/views/v2/.venv/lib64/python3.9/site-packages/sqlalchemy/engine/default.py", line 748, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: error in view my_foo: no such table: main.foo
... [output truncated]
[SQL: ALTER TABLE _alembic_tmp_foo RENAME TO foo]

I believe this is happening primarily due to how SQLite handles the ALTER TABLE statement. The error can be easily reproduced by reducing the logic of the above revisions to the following SQL statements (which is a simplified version of what alembic does):

CREATE TABLE foo (id INTEGER, data INTEGER);
CREATE VIEW fooview AS SELECT id FROM foo;
/* here we drop the `data` column from foo by altering the table via copy/drop/rename */
CREATE TABLE tmp_foo (id INTEGER);
/* skipped step as irrelevant: here we'd copy the rows from foo into tmp_foo */
DROP TABLE foo;
ALTER TABLE tmp_foo RENAME TO foo;  /* Error: error in view fooview: no such table: main.foo */

Expected behavior No error, OR a note in the documentation on batch migrations explaining that if a view references the table being altered, and SQLite's version is < 3.35, an error will happen.

To Reproduce See description. SQLite version: 3.34 (must be < 3.35, as 3.35 introduced the ALTER TABLE DROP COLUMN statement.)

Possible Solution As per SQLite's documentation, enabling the legacy_alter_table setting should solve this. Indeed, adding the PRAGMA legacy_alter_table=1 statement to the example above appears to solve the problem. The following code runs without errors, making the correct changes to the database:

CREATE TABLE foo (id INTEGER, data INTEGER);
CREATE VIEW fooview AS SELECT id FROM foo;
CREATE TABLE tmp_foo (id INTEGER);
DROP TABLE foo;
PRAGMA legacy_alter_table=1;
ALTER TABLE tmp_foo RENAME TO foo;
PRAGMA legacy_alter_table=0

Versions.

zzzeek commented 1 year ago

hi -

I'm not understanding the SQLite <3.35 requirement. If 3.35 introduced a DROP COLUMN alter, then that's great, but IIUC Alembic is not using that for a SQLite batch migration.

not understanding the legacy_alter_table part either - per the doc, that means our RENAME will not affect the view. But the DROP TABLE foo would still have to error out, I would think.

looking at the problem I would think it's not solvable unless the view is fully dropped and recreated outside the block.

jdavcs commented 1 year ago

Thanks for the fast reply!

I'm not understanding the SQLite <3.35 requirement. If 3.35 introduced a DROP COLUMN alter, then that's great, but IIUC Alembic is not using that for a SQLite batch migration.

Then I assumed incorrectly. I thought alembic would check if DROP COLUMN was available, and use the copy/rename/drop approach only if it were not available (I noticed that's the behavior on ALTER TABLE ADD COLUMN: my script uses batch ops, but the generated sql just uses the add column statement). In this case <3.35 is irrelevant.

not understanding the legacy_alter_table part either - per the doc, that means our RENAME will not affect the view. But the DROP TABLE foo would still have to error out, I would think.

Based on what I see in sqlite, the DROP TABLE doesn't raise an error, it's the RENAME that raises it. Here's a detailed transcript w/inline comments:

$ sqlite3 demo.sqlite
SQLite version 3.34.1 2021-01-20 14:10:07
sqlite> CREATE TABLE foo (id INTEGER, data INTEGER);
CREATE VIEW fooview AS SELECT id FROM foo;
CREATE TABLE tmp_foo (id INTEGER);
DROP TABLE foo;

# At this point we have the new tmp_foo table and the old view. 
# The drop succeeded, and the view points to the old, now non-existent table.

sqlite> .tables
fooview  tmp_foo
sqlite> .schema fooview
CREATE VIEW fooview AS SELECT id FROM foo;

# Now trigger the error: 

sqlite> ALTER TABLE tmp_foo RENAME TO foo;
Error: error in view fooview: no such table: main.foo

# My guess is that it checked the sqlite_schema table and found an entry that doesn't parse
(the view pointing to foo):

sqlite> select * from sqlite_schema;
view|fooview|fooview|0|CREATE VIEW fooview AS SELECT id FROM foo
table|tmp_foo|tmp_foo|3|CREATE TABLE tmp_foo (id INTEGER)

# So here if we issue the PRAGMA statement enabling the legacy mode, I think it does NOT make that check:

sqlite> PRAGMA legacy_alter_table=1;
sqlite> ALTER TABLE tmp_foo RENAME TO foo;

# And we end with no errors and the expected schema/database state:

sqlite> .schema fooview
CREATE VIEW fooview AS SELECT id FROM foo
/* fooview(id) */;
sqlite> .tables
foo      fooview

With regard to that statement affecting the RENAME operation, I've checked, and here's what it does:

looking at the problem I would think it's not solvable unless the view is fully dropped and recreated outside the block.

If the legacy_alter_table setting is not an option, then that would be my guess too. In that case, do you think it would make sense to add a note to the documentation? Unless this is too much of an edge case.

CaselIT commented 1 year ago

I'm not sure if there is anything to do here. @jdavcs what would a solution be here?

Is documenting that pragma an option?

jdavcs commented 1 year ago

@CaselIT Yes, I think mentioning it in the documentation would be helpful. I've implemented it like this; we use it to wrap our batch operation statements. It solved our problem. If you think this can be added to the docs, I'm happy to open a draft PR.

CaselIT commented 1 year ago

Ok, care to provide a PR with some docs? Your implementation could also maybe be a cookbook recipe

jdavcs commented 1 year ago

Will do, thanks!

CaselIT commented 1 year ago

Thanks!