fractal-analytics-platform / fractal-server

Fractal backend
https://fractal-analytics-platform.github.io/fractal-server/
BSD 3-Clause "New" or "Revised" License
10 stars 3 forks source link

SQLite and ALTER #732

Closed tcompa closed 1 year ago

tcompa commented 1 year ago

We have a known issue with SQLite an ALTER - see e.g. https://github.com/fractal-analytics-platform/fractal-server/issues/726. For the moment, we kept resetting the DB table, including as part of the upcoming v1.3, but this implies that each time we hit one of those cases we have a fractal-server update that is totally disruptive (as it requires to start with a fresh db).

In principle there exist workarounds, on the SQLite side (e.g. the render_as_batch=True, see link), but we should debug this further. We currently do have it, in the offline version of https://github.com/fractal-analytics-platform/fractal-server/blob/main/fractal_server/migrations/env.py, but it is clearly not working yet.

mfranzon commented 1 year ago

@tcompa Adding render_as_batch=True also to online migrations should fix the problem.

#~/env.py
...
def do_run_migrations(connection: Connection) -> None:
    context.configure(connection=connection, target_metadata=target_metadata,
                                render_as_batch=True) # this is the configuration for online method

    with context.begin_transaction():
        context.run_migrations()
...
tcompa commented 1 year ago

Maybe w still need to check:

tcompa commented 1 year ago

Here is the log of a working test.

$ git checkout main

$ git diff app/models/task.py
diff --git a/fractal_server/app/models/task.py b/fractal_server/app/models/task.py
index 505a4db4..b95f3006 100644
--- a/fractal_server/app/models/task.py
+++ b/fractal_server/app/models/task.py
@@ -33,8 +33,8 @@ class Task(_TaskBase, SQLModel, table=True):

     id: Optional[int] = Field(default=None, primary_key=True)
     name: str
-    command: str
-    source: str = Field(unique=True)
+    command: str = Field(unique=True)
+    source: str = Field(unique=False)
     input_type: str
     output_type: str
     meta: Optional[dict[str, Any]] = Field(sa_column=Column(JSON), default={})

$ export SQLITE_PATH=/tmp/some-test.db

$ rm /tmp/some-test.db

$ poetry run fractalctl set-db
Run alembic.config, with argv=['-c', '/home/tommaso/Fractal/fractal-server/fractal_server/alembic.ini', 'upgrade', 'head']
fractal_server.app.db
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 50a13d6138fd, Initial schema
INFO  [alembic.runtime.migration] Running upgrade 50a13d6138fd -> 4c308bcaea2b, Add task.args_schema and task.args_schema_version
INFO  [alembic.runtime.migration] Running upgrade 4c308bcaea2b -> f384e1c0cf5d, Drop task.default_args columns

$ poetry run alembic revision --autogenerate -m "add/remove unique constraints"
fractal_server.app.db
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.autogenerate.compare] Detected removed unique constraint 'uq_task_source' on 'task'
INFO  [alembic.autogenerate.compare] Detected added unique constraint 'None' on '['command']'
  Generating /home/tommaso/Fractal/fractal-server/fractal_server/migrations/versions/40ff039f5631_add_remove_unique_constraints.py ...  done

Now we have a new file migrations/versions/40ff039f5631_add_remove_unique_constraints.py

"""add/remove unique constraints

Revision ID: 40ff039f5631
Revises: f384e1c0cf5d
Create Date: 2023-06-29 12:36:57.829769

"""
from alembic import op
import sqlalchemy as sa
import sqlalchemy_utils
import sqlmodel

# revision identifiers, used by Alembic.
revision = '40ff039f5631'
down_revision = 'f384e1c0cf5d'
branch_labels = None
depends_on = None

def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('task', schema=None) as batch_op:
        batch_op.drop_constraint('uq_task_source', type_='unique')
        batch_op.create_unique_constraint(None, ['command'])

    # ### end Alembic commands ###

def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('task', schema=None) as batch_op:
        batch_op.drop_constraint(None, type_='unique')
        batch_op.create_unique_constraint('uq_task_source', ['source'])

    # ### end Alembic commands ###

and we can successfully go as in

$ rm /tmp/some-test.db

$ poetry run fractalctl set-db
Run alembic.config, with argv=['-c', '/home/tommaso/Fractal/fractal-server/fractal_server/alembic.ini', 'upgrade', 'head']
fractal_server.app.db
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 50a13d6138fd, Initial schema
INFO  [alembic.runtime.migration] Running upgrade 50a13d6138fd -> 4c308bcaea2b, Add task.args_schema and task.args_schema_version
INFO  [alembic.runtime.migration] Running upgrade 4c308bcaea2b -> f384e1c0cf5d, Drop task.default_args columns
INFO  [alembic.runtime.migration] Running upgrade f384e1c0cf5d -> 40ff039f5631, add/remove unique constraints

This is all as expected, thanks @ychiucco and @mfranzon! The only doubt left is why some constraints are named and some other are named as None - see python file above.

tcompa commented 1 year ago

Now I repeat the same test with 865cf78d9f0a5281cab90c65765e7c2af7bb39e3, which is not expected to work.

The migration file looks like

"""add/remove unique constraints

Revision ID: b352322ab10e
Revises: f384e1c0cf5d
Create Date: 2023-06-29 12:41:53.979756

"""
from alembic import op
import sqlalchemy as sa
import sqlalchemy_utils
import sqlmodel

# revision identifiers, used by Alembic.
revision = 'b352322ab10e'
down_revision = 'f384e1c0cf5d'
branch_labels = None
depends_on = None

def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.create_unique_constraint(None, 'task', ['command'])
    # ### end Alembic commands ###

def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_constraint(None, 'task', type_='unique')
    # ### end Alembic commands ###

without names. Thus we get the usual error

$ poetry run fractalctl set-db
Run alembic.config, with argv=['-c', '/home/tommaso/Fractal/fractal-server/fractal_server/alembic.ini', 'upgrade', 'head']
fractal_server.app.db
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 50a13d6138fd, Initial schema
INFO  [alembic.runtime.migration] Running upgrade 50a13d6138fd -> 4c308bcaea2b, Add task.args_schema and task.args_schema_version
INFO  [alembic.runtime.migration] Running upgrade 4c308bcaea2b -> f384e1c0cf5d, Drop task.default_args columns
INFO  [alembic.runtime.migration] Running upgrade f384e1c0cf5d -> b352322ab10e, add/remove unique constraints
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/tommaso/Fractal/fractal-server/fractal_server/__main__.py", line 83, in run
    set_db()
  File "/home/tommaso/Fractal/fractal-server/fractal_server/__main__.py", line 74, in set_db
    alembic.config.main(argv=alembic_args)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/config.py", line 632, in main
    CommandLine(prog=prog).main(argv=argv)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/config.py", line 626, in main
    self.run_cmd(cfg, options)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/config.py", line 603, in run_cmd
    fn(
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/command.py", line 385, in upgrade
    script.run_env()
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/script/base.py", line 582, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 94, in load_python_file
    module = load_module_py(module_id, path)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 110, in load_module_py
    spec.loader.exec_module(module)  # type: ignore
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/tommaso/Fractal/fractal-server/fractal_server/migrations/env.py", line 84, in <module>
    asyncio.run(run_migrations_online())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/home/tommaso/Fractal/fractal-server/fractal_server/migrations/env.py", line 76, in run_migrations_online
    await connection.run_sync(do_run_migrations)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/sqlalchemy/ext/asyncio/engine.py", line 548, in run_sync
    return await greenlet_spawn(fn, conn, *arg, **kw)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 128, in greenlet_spawn
    result = context.switch(value)
  File "/home/tommaso/Fractal/fractal-server/fractal_server/migrations/env.py", line 62, in do_run_migrations
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/runtime/environment.py", line 928, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/runtime/migration.py", line 628, in run_migrations
    step.migration_fn(**kw)
  File "/home/tommaso/Fractal/fractal-server/fractal_server/migrations/versions/b352322ab10e_add_remove_unique_constraints.py", line 23, in upgrade
    op.create_unique_constraint(None, 'task', ['command'])
  File "<string>", line 8, in create_unique_constraint
  File "<string>", line 3, in create_unique_constraint
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/operations/ops.py", line 480, in create_unique_constraint
    return operations.invoke(op)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/operations/base.py", line 395, in invoke
    return fn(self, operation)
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/operations/toimpl.py", line 175, in create_constraint
    operations.impl.add_constraint(
  File "/home/tommaso/.cache/pypoetry/virtualenvs/fractal-server-cyevISt_-py3.10/lib/python3.10/site-packages/alembic/ddl/sqlite.py", line 75, in add_constraint
    raise NotImplementedError(
NotImplementedError: No support for ALTER of constraints in SQLite dialect. Please refer to the batch mode feature which allows for SQLite migrations using a copy-and-move strategy.
tcompa commented 1 year ago

All is expected, but there's another point that I find confusing. It looks like the naming convention is only used when generating revisions based on changes, but not to generate the first one. If I remove all migrations/versions files and again generate the first schema, I get

...
def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.create_table('project',
    sa.Column('name', sqlmodel.sql.sqltypes.AutoString(), nullable=False),
    sa.Column('read_only', sa.Boolean(), nullable=False),
    sa.Column('id', sa.Integer(), nullable=False),
    sa.PrimaryKeyConstraint('id')
    )
    op.create_table('state',
    sa.Column('data', sa.JSON(), nullable=True),
    sa.Column('timestamp', sa.DateTime(timezone=True), nullable=True),
    sa.Column('id', sa.Integer(), nullable=False),
    sa.PrimaryKeyConstraint('id')
    )
    op.create_table('task',
    sa.Column('meta', sa.JSON(), nullable=True),
    sa.Column('args_schema', sa.JSON(), nullable=True),
    sa.Column('source', sqlmodel.sql.sqltypes.AutoString(), nullable=False),
    sa.Column('id', sa.Integer(), nullable=False),
    sa.Column('name', sqlmodel.sql.sqltypes.AutoString(), nullable=False),
    sa.Column('command', sqlmodel.sql.sqltypes.AutoString(), nullable=False),
    sa.Column('input_type', sqlmodel.sql.sqltypes.AutoString(), nullable=False),
    sa.Column('output_type', sqlmodel.sql.sqltypes.AutoString(), nullable=False),
    sa.Column('owner', sqlmodel.sql.sqltypes.AutoString(), nullable=True),
    sa.Column('version', sqlmodel.sql.sqltypes.AutoString(), nullable=True),
    sa.Column('args_schema_version', sqlmodel.sql.sqltypes.AutoString(), nullable=True),
    sa.PrimaryKeyConstraint('id'),
    sa.UniqueConstraint('source')
    )
...

while I would expect the last line to look like

    sa.UniqueConstraint('source', name="uq_task_source")

I can set this name by hand as part of the initial-schema python file, and then everything seems to work as expected. But are we possibly missing some config that would do that for us?

tcompa commented 1 year ago

Some clarifications:

  1. It is now understood why UniqueConstraint('source') appears without name=.... The explanation is the first code snippet in https://docs.sqlalchemy.org/en/20/core/constraints.html#configuring-a-naming-convention-for-a-metadata-collection: Screenshot from 2023-06-29 14-39-47 which doesn't have the name attribute in UniqueConstraint but still uses the naming convention. The third code snippet then makes it clear that names do explicitly appear in the migration scripts.

  2. I'm thinking about restarting the DB once again with a new initial schema. I've tried locally, and if we do it in main (that is, with the naming convention in place), then we avoid the "issue" shown in https://github.com/fractal-analytics-platform/fractal-server/issues/732#issuecomment-1612851895, namely the fact that migration scripts show some unnamed constraints (with Nones where the actual names should be). I should try this out a bit further, to see if it can be also done to re-encode existing data in the new schema.