cosmicpython / book

A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.
https://www.cosmicpython.com
Other
3.39k stars 527 forks source link

Chapter 6: SERIALIZABLE VS version_number #139

Open haibin opened 4 years ago

haibin commented 4 years ago

One question: If postgres has already implemented non-locking based SERIALIZABLE (https://drkp.net/papers/ssi-vldb12.pdf), why do we need additional version_number in the application? What would happen without version_number?

hjwp commented 4 years ago

I'm not sure -- would serializable ensure that no two transactions can update the allocations table at the same time? how?

adding that extra field to the products table, and making any transactions attempt to modify that single place, is what actually makes serializable work, i think.

why not try it? the integration test that uses threads could be rewritten to check on allocations, instead of checking on version numbers...

haibin commented 4 years ago

I think serializable should take care of all concurrency issues so that the application does not need to worry about it. But I'll try it for sure.

hjwp commented 4 years ago

you know, i think you might be right. to load up existing allocations for a product, we have to do a SELECT FROM allocations WHERE sku=xyz, and then when we add a new allocation we do an INSERT into that same table, which would make any concurrent attempts to do that same SELECT invalid. so, yep, SERIALIZABLE will work i think, with no version number.

i think! but the only way to be really sure is to try it ;-)

haibin commented 4 years ago

I checked out the chapter_06_aggregate branch, commented out the version_number and ran test_concurrent_updates_to_version_are_not_allowed. It still threw out this exception sqlalchemy.exc.OperationalError: (psycopg2.errors.SerializationFailure) could not serialize access due to read/write dependencies among transactions. So I'm wondering why we need this version_number?

$ make integration-tests
docker-compose run --rm --entrypoint='pytest -s /tests/integration/test_uow.py::test_concurrent_updates_to_version_are_not_allowed' app
Starting python-leap-code_postgres_1 ... done
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.8.0, pytest-5.3.0, py-1.8.0, pluggy-0.13.1
rootdir: /tests, inifile: pytest.ini
plugins: icdiff-0.2
collected 1 item                                                                                                                                                                                           

../tests/integration/test_uow.py Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.SerializationFailure: could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during conflict in checking.
HINT:  The transaction might succeed if retried.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tests/integration/test_uow.py", line 83, in try_to_allocate
    uow.commit()
  File "/src/allocation/unit_of_work.py", line 51, in commit
    self.session.commit()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1027, in commit
    self.transaction.commit()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 494, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 473, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2470, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2608, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
    raise value
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2568, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 540, in execute
    self.dependency_processor.process_saves(uow, states)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/dependency.py", line 1176, in process_saves
    self._run_crud(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/dependency.py", line 1239, in _run_crud
    connection.execute(statement, secondary_insert)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1095, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.errors.SerializationFailure) could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during conflict in checking.
HINT:  The transaction might succeed if retried.

[SQL: INSERT INTO allocations (orderline_id, batch_id) VALUES (%(orderline_id)s, %(batch_id)s) RETURNING allocations.id]
[parameters: {'orderline_id': 41, 'batch_id': 33}]
(Background on this error at: http://sqlalche.me/e/e3q8)
hjwp commented 4 years ago

Nice investigation!

One thing the version number maybe does give us is that SELECT FOR UPDATE will work, I think? You might have reasons to prefer that to ISOLATION=SERIALIZABLE (performane maybe?).

I think the real reason is to make an implicit concept explicit... I'll talk to @bobthemighty about it a bit more...