kvesteri / postgresql-audit

Audit trigger for PostgreSQL
BSD 2-Clause "Simplified" License
126 stars 28 forks source link

Trying to replacing SQLAlchemy-Continuum #32

Open djlambert opened 6 years ago

djlambert commented 6 years ago

I'm trying to replace SQLAlchemy-Continuum in my project with postgresql-audit but must be missing something.

When I replace the call to make_versioned with versioning_manager.init(Base) and try and create the schema it tries to use the audit_table function before it's defined.

Traceback (most recent call last):
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
    context)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
    cursor.execute(statement, parameters)
psycopg2.ProgrammingError: function audit_table(unknown) does not exist
LINE 1: SELECT audit_table('company') AS audit_table_1
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "populate.py", line 30, in <module>
    database_manager.create_schema()
  File "/Users/dereklambert/Development/audit-tools3/audit_tools/manager/database_manager.py", line 88, in create_schema
    Base.metadata.create_all(self.engine)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/sql/schema.py", line 4004, in create_all
    tables=tables)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1940, in _run_visitor
    conn._run_visitor(visitorcallable, element, **kwargs)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1549, in _run_visitor
    **kwargs).traverse_single(element)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py", line 121, in traverse_single
    return meth(obj, **kw)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py", line 757, in visit_metadata
    _is_metadata_operation=True)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py", line 121, in traverse_single
    return meth(obj, **kw)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py", line 810, in visit_table
    _is_metadata_operation=_is_metadata_operation)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/event/attr.py", line 284, in __call__
    fn(*args, **kw)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/postgresql_audit/base.py", line 33, in __call__
    bind.execute(self.stmt)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 948, in execute
    return meth(self, multiparams, params)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
    context)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
    exc_info
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 186, in reraise
    raise value.with_traceback(tb)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
    context)
  File "/Users/dereklambert/Development/audit-tools3/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) function audit_table(unknown) does not exist
LINE 1: SELECT audit_table('company') AS audit_table_1
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
 [SQL: 'SELECT audit_table(%(audit_table_2)s) AS audit_table_1'] [parameters: {'audit_table_2': 'company'}] (Background on this error at: http://sqlalche.me/e/f405)

I decided to try something simple to test the functionality, but no activity records are saved (similar to #26)

import sqlalchemy as sa
import sqlalchemy.orm as orm
import sqlalchemy_utils as utils
from sqlalchemy.ext.declarative import declarative_base
from postgresql_audit import VersioningManager

Base = declarative_base()

versioning_manager = VersioningManager()
versioning_manager.init(Base)

class Role(Base):
    name        = sa.Column(sa.String, primary_key=True)
    description = sa.Column(sa.String)

    __tablename__ = 'role'
    __versioned__ = {}

engine_url = 'postgresql+psycopg2://postgres@localhost/pyaudit'

if utils.database_exists(engine_url):
    utils.drop_database(engine_url)

utils.create_database(engine_url)

engine = sa.create_engine(engine_url, echo=True)

Base.metadata.create_all(engine)

session = orm.sessionmaker(bind=engine)()

role = Role(name='test')
session.add(role)
session.commit()

Activity = versioning_manager.activity_cls

activity = session.query(Activity).first()
print(activity)

I'd really like to get this working so I can test the performance against SQLAlchemy-Continuum before putting my project into production. I presume it'll be difficult to migrate after the fact.

djlambert commented 6 years ago

Did some digging through the tests and added the following code before creating all the tables:

orm.configure_mappers()

versioning_manager.transaction_cls.__table__.create(engine)
versioning_manager.activity_cls.__table__.create(engine)

This seems to have fixed the issue. Is this something missing from the documentation?

k-dal commented 5 years ago

I had the same issue, and this solved my problem, too. Base.metadata.create_all(engine) tries to use the audit_table function before it's created, but this is an easy workaround.

Thanks for commenting with your solution, @djlambert - you saved me on this one.

rogerio-geru commented 5 years ago

Hey @djlambert and @kevindalias !

Have you had the chance to compare the performance against continuum? Is it really better? Currently I'm using continuum in Postgresql without native_versioning, and working towards implementing it (but there's several bugs stopping me, like https://github.com/kvesteri/sqlalchemy-continuum/issues/117 ).

Also, did you manage to migrate all the data to the new unique table in postgresql-audit? Is it even possible?

Thank you!

k-dal commented 5 years ago

@rogerio-geru I've used continuum extensively in the past, and I can confidently tell you postgresql-audit is dramatically faster. There's just no way for Python to compete with native Postgres triggers. If speed is a problem for you, the switch is absolutely worth it. Theoretically native-versioning in continuum would offer a similar benefit, but I never got that working for the same reasons you mentioned.

I didn't migrate the old data from continuum, but I definitely think it's possible. It'd just take a hell of a lot of careful work and testing to get it right. So it lands in that gray area of cost/benefit. In my case, dropping the old data was actually a benefit because it'd become so enormous and unwieldy with continuum.

I've also noticed a number of nice side effects of switching, especially around type handling. I used to have problems with continuum confusing actual changes and type changes (e.g., it was picking up date changes because some modules were sending '2019-01-01' strings instead of datetime objects). Similar issues were occurring with boolean columns (False vs '0' vs 0 in Python). That went away with postgresql-audit since Postgres is in charge.

I know I gave a lot of non-answers, but I hope this helps you make a good decision.

rogerio-geru commented 5 years ago

Thank you for your comments @kevindalias! I think I will try it, and leaving behind the history for now, like you did.

But for the moment, I'm struggling a little. Where is the documentation @kvesteri? Isn't there any? I've studied the code, it's very nice, but what's the difference between statement and row level for instance? There isn't even docstrings to explain several parts.

I'll try it next week, but I'm apprehensive atm.

kvesteri commented 5 years ago

The difference if the type of triggers used for storing changes. For the majority of cases statement level triggers are by far faster than row level triggers.

Statement level trigger for an UPDATE that updates 1000 rows makes only a single version history INSERT whereas row level trigger would issue 1000 inserts.

rogerio-geru commented 5 years ago

Thank you @kvesteri, I see. But what's the format of this statement level Activity, what would the changed_data be?

Another, more important thing, you have changed the interface from Continuum, for example models do not get a .versions method anymore, and each activity, which is a version instance, do not have a .changeset method, which lists changed_data as

{
    'name': ['old', 'new']
}

And my code needs these methods. Any advice on how could I reintroduce them? Will I have to fork your repo and implement them in mine?

rogerio-geru commented 5 years ago

Well. I've just stumbled upon a very big problem... I don't know if I'm missing something, please help @kvesteri or @kevindalias !

All of my company models use multi-table inheritance in sqlalchemy, so for example I have this:

Base = declarative_base()

class Part(Base):
    __tablename__ = 'part'

class Person(Part):
    __tablename__ = 'person'

class Profile(Person):
    __tablename__ = 'profile'

class Borrower(Profile):
    __tablename__ = 'borrower'

When I retrieve my first versions, it'll return all of them apart:

In [7]: from postgresql_audit import versioning_manager

In [8]: Activity = versioning_manager.activity_cls

In [9]: Activity.query.all()
Out[9]: 
[<Activity table_name=u'user' id=1L>,
 <Activity table_name=u'account' id=2L>,
 <Activity table_name=u'part' id=3L>,
 <Activity table_name=u'person' id=4L>,
 <Activity table_name=u'profile' id=5L>,
 <Activity table_name=u'borrower' id=6L>]

How would I retrieve the actual changed data of the Borrower in question? Doesn't postgresql-audit merge the involved activities in all tables?

k-dal commented 5 years ago

It's hard to be sure without more context, but I did notice one particularly important omission in the snippets above: versioning_manager.init(Base)

There's more info on the SQLAlchemy integration here: https://postgresql-audit.readthedocs.io/en/stable/sqlalchemy.html - I found it pretty useful as a basic reference.