googleapis / python-spanner-sqlalchemy

Apache License 2.0
39 stars 28 forks source link

NOT NULL statements are not working for generated Alembic migrations #232

Closed IlyaFaer closed 2 years ago

IlyaFaer commented 2 years ago

the following 2 statements in the Spanner DDL window do the trick : ALTER TABLE test ADD COLUMN test STRING(MAX); ALTER TABLE test ALTER COLUMN test STRING(MAX) NOT NULL;

But there is no native way of doing it in SQLAlchemy The workaround we currently use is to manually edit the migration file to add the two above statements using op.execute(). Tt would be nice if it worked out-of-the-box

IlyaFaer commented 2 years ago

But there is no native way of doing it in SQLAlchemy

To add a NOT NULL rule for a column, one can use the nullable argument of the Column() class like this:

Untitled

Thus, when creating a model, this argument can be used, and it'll be accepted by Alembic migration system.

ansh0l commented 2 years ago

@IlyaFaer : I'm not sure if this would work as a migration out of the box. What happens when you apply the migration against a Spanner database, does it work? It would be good to have an integration test cover it.

louimet commented 2 years ago

Hello! :)

We use declarative mapping to create our Alembic migrations using the --autogenerate switch.

engine = create_engine(settings.SPANNER_URL)
metadata = MetaData(bind=engine)
Base = declarative_base(metadata=metadata)

class Test(Base):
    __tablename__ = 'test'

    id = Column(String(3), primary_key=True)
    name = Column(String(3), nullable=False)

I created the table with only the id column at first. Adding the name column after produces the following migration:

def upgrade() -> None:    
    with op.batch_alter_table('test', schema=None) as batch_op:
        batch_op.add_column(sa.Column('name', sa.String(length=3), nullable=False))

Running this migration gives this error:

google.api_core.exceptions.MethodNotImplemented: 501 Cannot add NOT NULL column test.name to existing table test.

This is expected, as you can't add a non-primary key, non-null column. I went back and instead created the column with nullable=True without problem. Then as a subsequent migration flipped nullable to False. This produces the following migration:

def upgrade() -> None:
    with op.batch_alter_table('test', schema=None) as batch_op:
        batch_op.alter_column('name',
               existing_type=sa.String(length=3),
               nullable=False)

Which, when run, gives the following error:

INFO  [alembic.runtime.migration] Running upgrade  -> 67449554fba0, initial generation
INFO  [alembic.runtime.migration] Running upgrade 67449554fba0 -> 414083dfb827, add column
INFO  [alembic.runtime.migration] Running upgrade 414083dfb827 -> dbf54f1c8ad7, flip column to non-null
Traceback (most recent call last):
  File "/root/.local/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 50, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "Error parsing Spanner DDL statement: ALTER TABLE test ALTER COLUMN name SET NOT NULL : Syntax error on line 1, column 40: Expecting 'OPTIONS' but found 'NOT'"
        debug_error_string = "{"created":"@1661269588.600907347","description":"Error received from peer ipv4:127.0.0.1:9010","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Error parsing Spanner DDL statement: ALTER TABLE test ALTER COLUMN name SET NOT NULL : Syntax error on line 1, column 40: Expecting 'OPTIONS' but found 'NOT'","grpc_status":3}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.local/lib/python3.9/site-packages/google/cloud/spanner_dbapi/cursor.py", line 244, in execute
    self.connection.run_prior_DDL_statements()
  File "/root/.local/lib/python3.9/site-packages/google/cloud/spanner_dbapi/connection.py", line 53, in wrapper
    return function(connection, *args, **kwargs)
  File "/root/.local/lib/python3.9/site-packages/google/cloud/spanner_dbapi/connection.py", line 411, in run_prior_DDL_statements
    return self.database.update_ddl(ddl_statements).result()
  File "/root/.local/lib/python3.9/site-packages/google/cloud/spanner_v1/database.py", line 474, in update_ddl
    future = api.update_database_ddl(request=request, metadata=metadata)
  File "/root/.local/lib/python3.9/site-packages/google/cloud/spanner_admin_database_v1/services/database_admin/client.py", line 1033, in update_database_ddl
    response = rpc(
  File "/root/.local/lib/python3.9/site-packages/google/api_core/gapic_v1/method.py", line 154, in __call__
    return wrapped_func(*args, **kwargs)
  File "/root/.local/lib/python3.9/site-packages/google/api_core/retry.py", line 283, in retry_wrapped_func
    return retry_target(
  File "/root/.local/lib/python3.9/site-packages/google/api_core/retry.py", line 190, in retry_target
    return target()
  File "/root/.local/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 52, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 Error parsing Spanner DDL statement: ALTER TABLE test ALTER COLUMN name SET NOT NULL : Syntax error on line 1, column 40: Expecting 'OPTIONS' but found 'NOT'

This looks like the incorrect DDL statement is generated by alter_column(). Hence our current workaround : create the column with nullable=True, and then manually add this line to the migration.

op.execute('ALTER TABLE test ALTER COLUMN name STRING(3) NOT NULL')
IlyaFaer commented 2 years ago

Creating a not nullable column should be possible in one step by providing a default column value during migration - it'll be used to fill values, which are NULL in the old column in the moment of applying NOT NULL rule to a new column.

IlyaFaer commented 2 years ago

@louimet, what version of SQLAlchemy you're using?

I recall now, we had something like this issue already, and there was an override added here: https://github.com/googleapis/python-spanner-sqlalchemy/blob/8bab5fb33cb7d08905b5ec5c17ee8f26b848b359/google/cloud/sqlalchemy_spanner/sqlalchemy_spanner.py#L1018-L1027

(in the original method a word SET is used before NOT NULL, which creates problems in Spanner): https://github.com/sqlalchemy/alembic/blob/cfe92fac6794515d3aa3b995e288b11d5c9437fa/alembic/ddl/base.py#L204-L212

Still, it looks like the override doesn't work in your case. I wonder if changing: compiles(ColumnNullable, "spanner") to @compiles(ColumnNullable, "spanner+spanner") will work.

Also check env.py file, there the dialect name should be set to Spanner:

class SpannerImpl(DefaultImpl):
    __dialect__ = "spanner+spanner"

In case of SQLAlchemy 1.3 it must be simple "spanner".

louimet commented 2 years ago

We're on 1.4.40. I remember having to change the dialect from 'spanner' to 'spanner+spanner' when we upgraded from 1.3.

I'll try adding a default value.

louimet commented 2 years ago

I can't seem to make it work by specifying a default value.

Plus, we don't necessarily want a default value on all non-nullable columns. This is necessary when flipping an existing column to non-nullable, but when adding a fresh new non-nullable column, we only want it non-nullable, rejecting inserts that don't specify a value for this column.

Class used:

class Test(Base):
    __tablename__ = 'test'
    id = Column(String(3), primary_key=True)
    name = Column(String(3), nullable=False, server_default=text('def'))

(I also tried passing the default as a string literal without using text().)

Directly adding the non-nullable column with a default value:

def upgrade() -> None:    
    with op.batch_alter_table('test', schema=None) as batch_op:
        batch_op.add_column(sa.Column('name', sa.String(length=3), server_default=sa.text('def'), nullable=False))

gives this error :

INFO  [alembic.runtime.migration] Running upgrade  -> 67449554fba0, initial generation
INFO  [alembic.runtime.migration] Running upgrade 67449554fba0 -> e78530b9a485, add non-nullable column
Traceback (most recent call last):
  File "/root/.local/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 50, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "Error parsing Spanner DDL statement: ALTER TABLE test ADD COLUMN name STRING(3) NOT NULL DEFAULT ('def') : Syntax error on line 1, column 53: Expecting 'EOF' but found 'DEFAULT'"
        debug_error_string = "{"created":"@1661352711.747002673","description":"Error received from peer ipv4:127.0.0.1:9010","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Error parsing Spanner DDL statement: ALTER TABLE test ADD COLUMN name STRING(3) NOT NULL DEFAULT ('def') : Syntax error on line 1, column 53: Expecting 'EOF' but found 'DEFAULT'","grpc_status":3}"
>

Creating a nullable column without default value, then flipping to non-nullable and adding default :

def upgrade() -> None:    
    with op.batch_alter_table('test', schema=None) as batch_op:
        batch_op.alter_column('name',
               existing_type=sa.String(length=3),
               nullable=False)

First, there's no sign of a default value in the generated migration (I added it manually later). Running this gives this error:

INFO  [alembic.runtime.migration] Running upgrade  -> 67449554fba0, initial generation
INFO  [alembic.runtime.migration] Running upgrade 67449554fba0 -> 3ca6b8f063fa, add nullable column
INFO  [alembic.runtime.migration] Running upgrade 3ca6b8f063fa -> 4013e6ab72a2, flip to non-nullable
Traceback (most recent call last):
  File "/root/.local/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 50, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/root/.local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "Error parsing Spanner DDL statement: ALTER TABLE test ALTER COLUMN name SET NOT NULL : Syntax error on line 1, column 40: Expecting 'OPTIONS' but found 'NOT'"
        debug_error_string = "{"created":"@1661352803.910627308","description":"Error received from peer ipv4:127.0.0.1:9010","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Error parsing Spanner DDL statement: ALTER TABLE test ALTER COLUMN name SET NOT NULL : Syntax error on line 1, column 40: Expecting 'OPTIONS' but found 'NOT'","grpc_status":3}"
>

Manually adding 'server_default' to the migration file gives the same error again :

debug_error_string = "{"created":"@1661352970.249802787","description":"Error received from peer ipv4:127.0.0.1:9010","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Error parsing Spanner DDL statement: ALTER TABLE test ADD COLUMN name STRING(3) DEFAULT ('def') : Syntax error on line 1, column 44: Expecting 'EOF' but found 'DEFAULT'","grpc_status":3}"

IlyaFaer commented 2 years ago

Okay, looks like I was able to reproduce the error finally:

ALTER TABLE test ALTER COLUMN name SET NOT NULL : Syntax error on line 1, column 40: Expecting 'OPTIONS' but found 'NOT'

Let's see now how to fix it...

IlyaFaer commented 2 years ago

@louimet, okay, locally I was able to make the failing migration work. As I suspected, it's about spanner and spanner+spanner.

One thing (looks like you already know this, and still) - when doing an ALTER, Spanner requires data type to always be mentioned. It causes a requirement for migrations to mention a current data type of the column except cases when data type is changed. This is a common restriction for several databases, see details.

Thus, existing_type should present in ALTER operations.

louimet commented 2 years ago

@IlyaFaer The existing type was already mentioned in the migration file (see here). Despite it being specified, the generated DDL doesn't seem to contain the type, as shown in the error message :

Error parsing Spanner DDL statement: ALTER TABLE test ALTER COLUMN name SET NOT NULL : Syntax error

But @ansh0l 's idea did the trick! I changed compiles(ColumnNullable, "spanner") to compiles(ColumnNullable, "spanner+spanner") and the same migration now passes without a hitch.

IlyaFaer commented 2 years ago

But @ansh0l 's idea did the trick!

A-ah, that's my comment you're mentioning actually,.. Anshul is another person. But okay, the main point it's working for you. I pushed a PR with the related changes yesterday with a test case of your problem and this spanner+spanner change: https://github.com/googleapis/python-spanner-sqlalchemy/pull/234

louimet commented 2 years ago

Oops! Sorry for getting you two mixed up. Thanks for the fix! :)