googleapis / python-spanner-sqlalchemy

Apache License 2.0
39 stars 28 forks source link

Introspection method returns primary key indexes, making Alembic migrations inconvenient #231

Closed IlyaFaer closed 2 years ago

IlyaFaer commented 2 years ago

When we generate migrations with Alembic, it always tries to drop a bunch of indexes. Running the migration fails because they are used by foreign keys.

The dropped indexes are those for the primary key of each table. Those indexes are never explicitly created by Alembic, but they are implicitly by virtue of being on primary key columns. This might explain why Alembic tries to drop them. Right now, we delete those drop_index manually every time we generate a migration, which is tedious

IlyaFaer commented 2 years ago

@asthamohta, @ansh0l

IlyaFaer commented 2 years ago

So, the proposed idea was that Alembic is trying to drop primary keys of tables (while primary keys should be managed automatically by Spanner and never touched by users). To build a migration, Alembic uses an introspection method to understand what is the table structure and what needs to be done. All of it is happening here:

https://github.com/sqlalchemy/alembic/blob/873027160f88af277b73223e9fe33bfba315532f/alembic/autogenerate/compare.py#L570

https://github.com/sqlalchemy/sqlalchemy/blob/f8c4dba4e9f130c18ce00597c036bc26ae7abf90/lib/sqlalchemy/engine/reflection.py#L1135

And the dialect introspection method is already ignoring primary key indexes: https://github.com/googleapis/python-spanner-sqlalchemy/blob/8bab5fb33cb7d08905b5ec5c17ee8f26b848b359/google/cloud/sqlalchemy_spanner/sqlalchemy_spanner.py#L691

Thus, it seems Alembic should not try to delete primary key indexes, as it'll not see them in an existing table.

I've tried to reproduce the case by creating two tables, one of which is foreign keyed into another, and introspection works fine. I even tried to delete the tables (when it's done with SQLAlchemy, it first drops the table indexes - otherwise Spanner will not allow to drop the table, breaking half of SQLAlchemy functionality). Here are codes:

CREATE TABLE Singers (
            SingerId     INT64 NOT NULL,
            FirstName    STRING(1024),
            Params       ARRAY<JSON>,
        ) PRIMARY KEY (SingerId)
CREATE TABLE Song (
        SongId INT64 NOT NULL,
        Name STRING(256),
        Author INT64 NOT NULL,
        CONSTRAINT FK_SongAuthor FOREIGN KEY (Author) REFERENCES Singers (SingerId)
    ) PRIMARY KEY (SongId)

Introspection method used by Alembic:

from sqlalchemy import create_engine, inspect, Table, MetaData

engine = create_engine(
    "spanner+spanner:///projects/*/instances/*/databases/*"
)

inspector = inspect(engine)
print(inspector.get_indexes("Singers"))

# table = Table("Singers", MetaData(bind=engine), autoload=True)
# table.drop()

And here is what was generated and executed by SQLAlchemy on my side:

DROP INDEX `IDX_Song_Author_C8CE5D5C92C63485`;ALTER TABLE Song DROP CONSTRAINT `FK_SongAuthor`;
DROP TABLE Song;

No primary keys affected.

Thus, I suspect the problem is not really related to introspection and primary keys. As Alembic's support for indexes and foreign keys constraints is declared to be only basic, the root of the issue is probably in Alembic itself. It'd be good to see traceback of the original error and the migration script generated by Alembic.

ansh0l commented 2 years ago

@louimet Would you be able able to take a look at this issue and let us know the error you are facing?

louimet commented 2 years ago

I started from scratch to be sure to be able to reproduce the error.

Using the following declarative classes, I generated first a migration with only the Test class. Then a second one that added the Test2 class, with the foreign key on the first table.

class Test(Base):
    __tablename__ = 'test'

    id = Column(String(3), primary_key=True)
    name = Column(String(3), ForeignKey('test2.name'))

class Test2(Base):
    __tablename__ = 'test2'

    id = Column(String(3), primary_key=True)
    name = Column(String(3))

Running those two migrations works fine. But then, if I leave everything as is, i.e. don't change a single thing in my classes, and try to generate a new revision using --autogenerate, I get the following migration :

def upgrade() -> None:
    with op.batch_alter_table('test', schema=None) as batch_op:
        batch_op.drop_index('IDX_test_name_N_02FB4FA83F2E5433')

    with op.batch_alter_table('test2', schema=None) as batch_op:
        batch_op.drop_constraint('IDX_test2_name_U_3D6929C9E41B453D', type_='unique')
        batch_op.drop_index('IDX_test2_name_U_3D6929C9E41B453D')

Running this gives :

INFO  [alembic.runtime.migration] Running upgrade  -> 67449554fba0, initial generation
INFO  [alembic.runtime.migration] Running upgrade 67449554fba0 -> 1a4eb9df42b0, add foreign key table
INFO  [alembic.runtime.migration] Running upgrade 1a4eb9df42b0 -> c7168018b6b3, test drop index
Traceback (most recent call last):
  File "/root/.local/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 50, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.FAILED_PRECONDITION
        details = "Cannot drop index `IDX_test_name_N_02FB4FA83F2E5433`. It is in use by foreign keys: `FK_test_test2_B0D0A324DEB5002B_1`."
        debug_error_string = "{"created":"@1661273130.815627136","description":"Error received from peer ipv4:127.0.0.1:9010","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Cannot drop index `IDX_test_name_N_02FB4FA83F2E5433`. It is in use by foreign keys: `FK_test_test2_B0D0A324DEB5002B_1`.","grpc_status":9}"

For reference, here's the content of the first two migrations:

Initial generation :

def upgrade() -> None:
    op.create_table('test', sa.Column('id', sa.String(length=3), nullable=False), sa.PrimaryKeyConstraint('id'))

Add foreign key table:

def upgrade() -> None:
    op.create_table('test2',
    sa.Column('id', sa.String(length=3), nullable=False),
    sa.Column('name', sa.String(length=3), nullable=True),
    sa.PrimaryKeyConstraint('id')
    )
    with op.batch_alter_table('test', schema=None) as batch_op:
        batch_op.add_column(sa.Column('name', sa.String(length=3), nullable=True))
        batch_op.create_foreign_key(None, 'test2', ['name'], ['name'])
IlyaFaer commented 2 years ago

@louimet, instead of:

def upgrade() -> None:
    op.create_table(
        'test',
        sa.Column('id', sa.String(length=3), nullable=False),
        sa.PrimaryKeyConstraint('id')
    )

You can write this:

def upgrade() -> None:
    op.create_table(
        "test",
        sa.Column("id", sa.String(length=3), primary_key=True),
    )

It'll generate the same query with NOT NULL and PRIMARY KEY.

IlyaFaer commented 2 years ago

Running those two migrations works fine

Hm-m... On my end the second migration fails with the error:

The transaction was aborted and could not be retried due to a concurrent modification. UPDATE alembic_version SET version_num='7014eb0e9a2b' WHERE alembic_version.version_num = '072cf4e176ce'

Don't you see the same in the migration logs?

For a deeper analysis of the error root see this comment. Thus, I suspect here happens the same: migration is actually applied, but Alembic didn't update alembic_version table, so when you're trying to generate a new migration, Alembic thinks that foreign keys must be dropped, because they are in the base actually, but there is no record about what migration added them. If no migrations added them, than they must not exist - that's the logic I suppose.

Let's see if the solution proposed in this comment works.

louimet commented 2 years ago

Yes, that's the problem we were discussing in that other thread. I was using the emulator here.

But I just tried it online with the fix you suggested, and can reproduce the problem.

Also, the alembic_version table does show the correct version number (that of the second migration, the one that adds the second table and the foreign key), so that doesn't seem to be the problem.

IlyaFaer commented 2 years ago

@louimet, aha, I see the problem. Our dialect introspection method returns indexes, which are ruled by Spanner and should not be visible for users. Fixing it in #241

louimet commented 2 years ago

Awesome, thanks @IlyaFaer!

louimet commented 2 years ago

Hi,

I can still reproduce this issue in 1.2.1.

Generate a migration using these models.

class Test(Base):
    __tablename__ = 'test'
    id = Column(String, primary_key=True)

class RelatedTest(Base):
    __tablename__ = 'related_test'
    id = Column(String, primary_key=True)
    test_id = Column(String, ForeignKey(Test.id))

Run alembic upgrade head, then immediately generate another revision, and it contains a drop of the foreign key index.

IlyaFaer commented 2 years ago

@louimet, oh, yes. 1.2.1 was released in Aug, while the fixes were made in Sep. @asthamohta, it'd be good to cut a release with the last bunch of fixes.

louimet commented 2 years ago

Oh... Sorry, I'm confused. I was told that the version containing the fixes had been released. I thought it weird that it was dated back from August. Well, that applies to my comment in 232 as well then. I'll clarify the matter. Thanks!