DanCardin / sqlalchemy-declarative-extensions

Library to declare additional kinds of objects not natively supported by SqlAlchemy/Alembic.
https://sqlalchemy-declarative-extensions.readthedocs.io/en/latest/
Apache License 2.0
33 stars 5 forks source link

feat: Add separate Procedure object. #65

Closed DanCardin closed 2 months ago

DanCardin commented 3 months ago

Fixes https://github.com/DanCardin/sqlalchemy-declarative-extensions/issues/63

At least with postgres as a reference, procedures seem to be essentially be a limited subset of a function, ~so the current impl introduces Procedures as a superclass from which Function is derived. I dont see any reason why this would be impractical to separate in the future, should it turn out to be necessary.

I was also considering treating them as separate objects, but their whole comparison infrastructure and everything would end up identical, and it wasn't clear why one would want to use one without the other. As such, for now at least, the PR is keeping procedures bundled under the "function" object category.~

They are implemented as a whole separate object type now. This helped typing a lot, as well as being potentially more logical for impling something like Snowflake, where they're wholly different objects. It's somewhat unfortunate for postgres because they share a namespace, but in the end this solves more problems than it generates.

coveralls commented 2 months ago

Pull Request Test Coverage Report for Build 9814772792

Details


Changes Missing Coverage Covered Lines Changed/Added Lines %
src/sqlalchemy_declarative_extensions/dialects/postgresql/function.py 33 37 89.19%
<!-- Total: 60 64 93.75% -->
Totals Coverage Status
Change from base Build 9781648865: 96.5%
Covered Lines: 2593
Relevant Lines: 2680

💛 - Coveralls
YaraslauZhylko commented 2 months ago

I tried out the currunt code and that's what I found so far:


  1. The change of the type (function -> procedure) is not detected if the name remains the same.

    I changed this definition:

    refresh_my_view_function = Function(
        "refresh_my_view",
        """
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        """,
        language="plpgsql",
        security=FunctionSecurity.definer,
    )

    to this:

    refresh_my_view_procedure = Procedure(
        "refresh_my_view",
        """
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        """,
        language="plpgsql",
        security=FunctionSecurity.definer,
    )

    The resulting migration looks like this:

    def upgrade() -> None:
        op.execute(
            """CREATE OR REPLACE PROCEDURE refresh_my_view() SECURITY DEFINER LANGUAGE plpgsql AS $$
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        $$;"""
        )
    
    def downgrade() -> None:
        op.execute(
            """CREATE OR REPLACE FUNCTION refresh_my_view() RETURNS void SECURITY DEFINER LANGUAGE plpgsql AS $$
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        $$;"""
        )

    Such a migration expectedly raises an error:

    sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.WrongObjectTypeError'>: cannot change routine kind
    DETAIL:  "refresh_my_view" is a function.
    [SQL: CREATE OR REPLACE PROCEDURE refresh_my_view() SECURITY DEFINER LANGUAGE plpgsql AS $$
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        $$;]

    But if I change the name a bit, the migration is generated correctly:

    refresh_my_view_procedure = Procedure(
        "refresh_my_view_new",
        """
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        """,
        language="plpgsql",
        security=FunctionSecurity.definer,
    )
    def upgrade() -> None:
        op.execute(
            """CREATE PROCEDURE refresh_my_view_new() SECURITY DEFINER LANGUAGE plpgsql AS $$
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        $$;"""
        )
        op.execute("""DROP FUNCTION refresh_my_view();""")
    
    def downgrade() -> None:
        op.execute(
            """CREATE FUNCTION refresh_my_view() RETURNS void SECURITY DEFINER LANGUAGE plpgsql AS $$
                BEGIN
                    REFRESH MATERIALIZED VIEW my_view WITH DATA;
                END
                $$;"""
        )
        op.execute("""DROP PROCEDURE refresh_my_view_new();""")

    In that case the migration is executed successfully, and I'm able to CALL the newly credted procedure.

    So I guess we need to compare the routine type as well when doing the comparison.


  1. Running alembic check raises an error:

    Traceback (most recent call last):
      File "/home/yaraslau/projects/test/.venv/bin/alembic", line 8, in <module>
        sys.exit(main())
                 ^^^^^^
      File "/home/yaraslau/projects/test/.venv/lib/python3.12/site-packages/alembic/config.py", line 636, in main
        CommandLine(prog=prog).main(argv=argv)
      File "/home/yaraslau/projects/test/.venv/lib/python3.12/site-packages/alembic/config.py", line 626, in main
        self.run_cmd(cfg, options)
      File "/home/yaraslau/projects/test/.venv/lib/python3.12/site-packages/alembic/config.py", line 603, in run_cmd
        fn(
      File "/home/yaraslau/projects/test/.venv/lib/python3.12/site-packages/alembic/command.py", line 297, in check
        diffs.extend(upgrade_ops.as_diffs())
                     ^^^^^^^^^^^^^^^^^^^^^^
      File "/home/yaraslau/projects/test/.venv/lib/python3.12/site-packages/alembic/operations/ops.py", line 2572, in as_diffs
        return list(OpContainer._ops_as_diffs(self))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/yaraslau/projects/test/.venv/lib/python3.12/site-packages/alembic/operations/ops.py", line 2582, in _ops_as_diffs
        yield op.to_diff_tuple()
              ^^^^^^^^^^^^^^^^
    AttributeError: 'UpdateFunctionOp' object has no attribute 'to_diff_tuple'

  1. When generating a new migration, Alembic tries to repalce the procedure again:

    def upgrade() -> None:
        op.execute(
            """CREATE OR REPLACE PROCEDURE refresh_my_view_new() SECURITY DEFINER LANGUAGE plpgsql AS $$
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        $$;"""
        )
    
    def downgrade() -> None:
        op.execute(
            """CREATE OR REPLACE FUNCTION refresh_my_view_new() RETURNS void SECURITY DEFINER LANGUAGE plpgsql AS $$
        BEGIN
            REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        $$;"""
        )

    Can this be related to an old issue of having a diff because of inconsistent leading whitespaces? :thinking:


  1. Can we make imports more consistent by exposing Procedure and FunctionSecurity in src/sqlalchemy_declarative_extensions/dialects/postgresql/__init__.py?

    So that we do not need to import like that:

    from sqlalchemy_declarative_extensions.dialects.postgresql import MaterializedOptions
    from sqlalchemy_declarative_extensions.dialects.postgresql.function import FunctionSecurity, Procedure

    and instead just do this:

    from sqlalchemy_declarative_extensions.dialects.postgresql import FunctionSecurity, MaterializedOptions, Procedure

  1. Why do we need to add procedures to function definitons?

    Base.functions.append(refresh_my_view_procedure)

    Why can't we have Procedures and Functions containers separated for more readability?

    class Base(DeclarativeBase):
        functions = Functions()
        procedures = Procedures()
        triggers = Triggers()
    
    Base.functions.append(some_function)
    Base.procedures.append(refresh_my_view_procedure)

  1. Same question goes for FunctionSecurity enum.

    This code seems a bit weird:

    refresh_my_view_procedure = Procedure(  # <-----
        "refresh_my_view",
        "...",
        language="plpgsql",
        security=FunctionSecurity.definer,  # <-----
    )

    This would have looked much more readable:

    refresh_my_view_procedure = Procedure(  # <-----
        "refresh_my_view",
        "...",
        language="plpgsql",
        security=ProcedureSecurity.definer,  # <-----
    )

    Maybe have the same enum exposed under two different name? :thinking:

    Or have some universal one like RoutineSecurity:

    refresh_my_view_procedure = Procedure(  # <-----
        "refresh_my_view",
        "...",
        language="plpgsql",
        security=RoutineSecurity.definer,  # <-----
    )

  1. Should functions and procedures actually be separated into two different modules?

    Like:

    • src/sqlalchemy_declarative_extensions/dialects/postgresql/function.py
    • src/sqlalchemy_declarative_extensions/dialects/postgresql/procedure.py
DanCardin commented 2 months ago

I kept them together because function inherits from procedure and they're not meaningfully different from the perspective of the lib. And further in postgres they're saved to the same pg table so it seemed easiest to colocate them. However, i just took a gander at, for example, snowflake and they appear to be wholly different, and perhaps it does make sense to separate them...

Although i'm perhaps starting to believe i could/should just fully separate them. it'd address items 1/6/7/8 de-facto

DanCardin commented 2 months ago

As for alembic check. I wasn't aware of this command slash assumed interface. It doesn't work for any of the existing object types. I can look into that separately.

YaraslauZhylko commented 2 months ago

As for alembic check. I wasn't aware of this command slash assumed interface. It doesn't work for any of the existing object types. I can look into that separately.

It did work for us before with triggers, functions and views.

I guess trying out Procedure in this PR was the first time it failed.

YaraslauZhylko commented 2 months ago

@DanCardin Please ping me when it is ready to be tested again.

coveralls commented 2 months ago

Pull Request Test Coverage Report for Build 9849441494

Details


Changes Missing Coverage Covered Lines Changed/Added Lines %
src/sqlalchemy_declarative_extensions/dialects/postgresql/procedure.py 36 37 97.3%
src/sqlalchemy_declarative_extensions/procedure/base.py 50 52 96.15%
src/sqlalchemy_declarative_extensions/procedure/compare.py 57 59 96.61%
<!-- Total: 223 228 97.81% -->
Totals Coverage Status
Change from base Build 9781648865: 96.6%
Covered Lines: 2761
Relevant Lines: 2850

💛 - Coveralls
DanCardin commented 2 months ago

@YaraslauZhylko should be good now. For some of your bulleted items.

  1. I wouldn't expect this to be fixed per-se. Converting a procedure to a function might work without edit, but the reverse direction will probably yield the procedure drop after the function create which will fail (until you reorder the statements). this is fairly common even with vanilla alembic, so i dont really consider it a blocker, even if i might still try to fix it eventually.
  2. alembic check wont work, nor does it work anywhere else. i'll raise that as a separate issue.
  3. shouldnt happen in the general case, but isn't foolproof. it's not safe to manipulate unknown function body content beyond the basic normalization i do today. 5/6/7/8. should all be fixed
YaraslauZhylko commented 2 months ago

@DanCardin Actually, everything is fixed now. :upside_down_face:

  1. Alambic now correctly determins the change of the routine type from function to procedure without renaming it. :tada:

    def upgrade() -> None:
        op.execute("""DROP FUNCTION refresh_my_view();""")
        op.execute(
            """CREATE PROCEDURE refresh_my_view() SECURITY DEFINER LANGUAGE plpgsql AS $$
        BEGIN
             REFRESH MATERIALIZED VIEW my_view WITH DATA;
        END
        $$;"""
        )
    
    def downgrade() -> None:
        op.execute("""DROP PROCEDURE refresh_my_view();""")
        op.execute(
            """CREATE FUNCTION refresh_my_view() RETURNS void SECURITY DEFINER LANGUAGE plpgsql AS $$
                BEGIN
                    REFRESH MATERIALIZED VIEW my_view WITH DATA;
                END
                $$;"""
        )
  2. Even though the indentation is still a bit weird (probably because of the post-formatting in the original migration), Alembic does detect it correctly, and does not generate another migration if I run alembic revision for the second time. Even if I update the indentation/formatting manually in the migration file.

  3. alembic check does work now:

    No new upgrade operations detected.
    FAILED: Target database is not up to date.
  4. All imports look pretty and logical now.

Thanks!