sqlalchemy.ext.serializer fail with column_property in query condition

MikhailZhukov commented 4 years ago

Deserialization of query with column_property fail with error

Traceback (most recent call last):
  File "/home/mzhukov/projects/test_sqa/test_sql/main.py", line 53, in <module>
    assert str(orig_req) == str(deserialized_obj)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3379, in __str__
    return str(context.statement.compile(bind))
  File "<string>", line 1, in <lambda>
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 462, in compile
    return self._compiler(dialect, bind=bind, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 468, in _compiler
    return dialect.statement_compiler(dialect, self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 571, in __init__
    Compiled.__init__(self, dialect, statement, **kwargs)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 319, in __init__
    self.string = self.process(self.statement, **compile_kwargs)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 350, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2140, in visit_select
    text, select, inner_columns, froms, byfrom, kwargs
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2239, in _compose_select_body
    t = select._whereclause._compiler_dispatch(self, **kwargs)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1299, in visit_binary
    return self._generate_generic_binary(binary, opstring, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1346, in _generate_generic_binary
    + binary.right._compiler_dispatch(
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/annotation.py", line 79, in _compiler_dispatch
    return self.__element.__class__._compiler_dispatch(self, visitor, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 836, in visit_label
    self, within_columns_clause=False, **kw
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/dialects/sqlite/base.py", line 990, in visit_cast
    return super(SQLiteCompiler, self).visit_cast(cast, **kwargs)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1015, in visit_cast
    cast.clause._compiler_dispatch(self, **kwargs),
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1119, in visit_function
    ) % {"expr": self.function_argspec(func, **kwargs)}
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1131, in function_argspec
    return func.clause_expr._compiler_dispatch(self, **kwargs)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 727, in visit_grouping
    return "(" + grouping.element._compiler_dispatch(self, **kwargs) + ")"
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 983, in visit_clauselist
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 981, in <genexpr>
    s
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 983, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 92, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/mzhukov/.local/share/virtualenvs/test_sqa-05qMtTeV/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1503, in visit_bindparam
    % bindparam.key
sqlalchemy.exc.CompileError: Bind parameter '%(140081294316096 left)s' conflicts with unique bind parameter of the same name

I test this error with Python3.7 + SQLAlchemy 1.3.12 In Python2.7 + SQLAlchemy 1.1.18 deserialization does not work too, but with other exception

For reproducing this bug I make simple test file:

# -*- coding: utf-8 -*-
import pickle
import sys
import traceback

from sqlalchemy import create_engine, Integer, Column, String, exc as sa_exc
from sqlalchemy.ext import serializer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import sessionmaker, scoped_session, column_property
from sqlalchemy.sql import func

Base = declarative_base()
db_engine = create_engine('sqlite:///:memory:')
Session = scoped_session(sessionmaker(bind=db_engine))

class TestTable(Base):
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True, autoincrement=True)
    _some_id = Column('some_id', String)
    some_primary_id = column_property(func.left(_some_id, 6).cast(Integer))

class SecondTestTable(Base):
    __tablename__ = 'test2'

    id = Column(Integer, primary_key=True, autoincrement=True)
    _some_id = Column('some_id', String)

    @hybrid_property
    def some_primary_id(self):
        return int(self._some_id[:6])

    @some_primary_id.expression
    def some_primary_id(cls):
        return func.left(cls._some_id, 6).cast(Integer)

#  Create tables and two records
Base.metadata.create_all(bind=db_engine)
db_session = Session()
db_session.add(TestTable(_some_id="1234567890"))
db_session.add(SecondTestTable(_some_id="1234567890"))
db_session.commit()

#  Query without column_property in condition work
orig_req = db_session.query(TestTable)
serialised_obj = serializer.dumps(orig_req, pickle.HIGHEST_PROTOCOL)
deserialized_obj = serializer.loads(serialised_obj, metadata=Base.metadata, scoped_session=Session)
assert str(orig_req) == str(deserialized_obj)

#  Query with column_property in condition fail
orig_req = db_session.query(TestTable).filter(TestTable.some_primary_id == 123456)
serialised_obj = serializer.dumps(orig_req, pickle.HIGHEST_PROTOCOL)
deserialized_obj = serializer.loads(serialised_obj, metadata=Base.metadata, scoped_session=Session)
try:
    assert str(orig_req) == str(deserialized_obj)
except sa_exc.SQLAlchemyError as exc:
    print('Error: %r' % exc)
    traceback.print_exc(file=sys.stdout)

#  Query with hybrid_property in condition work
orig_req = db_session.query(SecondTestTable).filter(SecondTestTable.some_primary_id == 123456)
serialised_obj = serializer.dumps(orig_req, pickle.HIGHEST_PROTOCOL)
deserialized_obj = serializer.loads(serialised_obj, metadata=Base.metadata, scoped_session=Session)
assert str(orig_req) == str(deserialized_obj)

zzzeek commented 4 years ago

hi there -

thanks for the clear test case.

unfortunately this issue is an illustration of a problem in the "serializer" extension that isn't really solvable in a generalized way. The issue here is that the query against TestTable, when deserialized, uses the TestTable class that's in the script as well as the TestTable.some_primary_id object when it renders columns clause of the query. However, the WHERE clause is using the deserialized version of TestTable.some_primary_id. The bound parameter, when presented with itself as well as it's cloned self, causes a conflict because they have the same name.

The bound parameter object could try to work around this by changing its "unique" name when deserialized, i'll probably just commit that, which does fix this specific issue and is:


diff --git a/lib/sqlalchemy/sql/elements.py b/lib/sqlalchemy/sql/elements.py
index 422eb6220..54981375a 100644
--- a/lib/sqlalchemy/sql/elements.py
+++ b/lib/sqlalchemy/sql/elements.py
@@ -1386,6 +1386,13 @@ class BindParameter(roles.InElementRole, ColumnElement):
         d["value"] = v
         return d

+    def __setstate__(self, state):
+        if state.get('unique', False):
+            state['key'] = _anonymous_label(
+                "%%(%d %s)s" % (id(self), state.get('_orig_key', 'param'))
+            )
+        self.__dict__.update(state)
+
     def __repr__(self):
         return "BindParameter(%r, %r, type_=%r)" % (
             self.key,

I can't be sure what other unwanted effects this may have in other areas that deal with serialization but I will have to hope that there is very little serialize/deserialize of queries going on in real world applications in general.

In fact I hope to deprecate and remove the "serializer" extension in any case as there are a lot more things it doesn't really do correctly. Can I ask what your use case for the "serializer" is? AFAIK there aren't really any good uses for it and it should not have been added.

sqla-tester commented 4 years ago

Mike Bayer has proposed a fix for this issue in the master branch:

Alter unique bound parameter key on deserialize https://gerrit.sqlalchemy.org/1660

sqla-tester commented 4 years ago

Mike Bayer has proposed a fix for this issue in the rel_1_3 branch:

Alter unique bound parameter key on deserialize https://gerrit.sqlalchemy.org/1661

tribals commented 4 years ago

I used serializer for implementing pagination for REST API. First request arrives, I build query, then serialize and cache it. When client want next chunk, I deserialize query, adjust limit/offset and execute it.

The reason why I chosen this way is that queries can be very complicated, and context (attributes and values, joins, etc) will be lost when requesting next chunk.

MikhailZhukov commented 4 years ago

Hi.

Thanks a lot for helping.

In fact I hope to deprecate and remove the "serializer" extension in any case as there are a lot more things it doesn't really do correctly. Can I ask what your use case for the "serializer" is? AFAIK there aren't really any good uses for it and it should not have been added.

Thanks for information about deprecating serializer. We will think about replacement it in our pagination implementation.

zzzeek commented 4 years ago

is it possible that instaed of caching the completed query, you cache the information from the REST request that is used to build up that query? this would be much more portable.

MikhailZhukov commented 4 years ago

yes, i think we will do that in nearest future. I replaced all column_property to hybrid_property now and it look like it work good enough while we will refactor code base

sqlalchemy / sqlalchemy

sqlalchemy.ext.serializer fail with column_property in query condition #5086