yesodweb / persistent

Persistence interface for Haskell allowing multiple storage methods.
MIT License
467 stars 297 forks source link

weird behaviour of runSqlPoolNoTransaction on persistent-mysql #1453

Open fumieval opened 1 year ago

fumieval commented 1 year ago

Currently autocommit is disabled here: https://github.com/yesodweb/persistent/blob/b8484fa8327358f5770d52f6243b9cc5c0861122/persistent-mysql/Database/Persist/MySQL.hs#L135

This means that runSqlPoolNoTransaction does not do what most users would expect; instead it leaves uncommitted changes. Is this intended? My understanding is that there is no need to disable autocommit because START TRANSACTION is called explicitly here

parsonsmatt commented 1 year ago

Huh. That line of code was written in 2012 with the first commit of persistent-mysql, which also includes the start transaction command as part of begin.

The docs for runSqlPoolNoTransaction say:

-- | Like 'runSqlPool', but does not surround the action in a transaction.
-- This action might leave your database in a weird state.
--
-- @since 2.12.0.0

MySQL documentation link, citing some relevant stuff:

A session that has autocommit enabled can perform a multiple-statement transaction by starting it with an explicit START TRANSACTION or BEGIN statement and ending it with a COMMIT or ROLLBACK statement. See Section 13.3.1, “START TRANSACTION, COMMIT, and ROLLBACK Statements”.

If autocommit mode is disabled within a session with SET autocommit = 0, the session always has a transaction open. A COMMIT or ROLLBACK statement ends the current transaction and a new one starts.

If a session that has autocommit disabled ends without explicitly committing the final transaction, MySQL rolls back that transaction.

Some statements implicitly end a transaction, as if you had done a COMMIT before executing the statement. For details, see Section 13.3.3, “Statements That Cause an Implicit Commit”.

A COMMIT means that the changes made in the current transaction are made permanent and become visible to other sessions. A ROLLBACK statement, on the other hand, cancels all modifications made by the current transaction. Both COMMIT and ROLLBACK release all InnoDB locks that were set during the current transaction.


runSqlPool uses the underlying SqlBackend and calls the relevant transaction stuff:

runSqlPool
    :: forall backend m a. (MonadUnliftIO m, BackendCompatible SqlBackend backend)
    => ReaderT backend m a -> Pool backend -> m a
runSqlPool r pconn = do
    rawRunSqlPool r pconn Nothing

rawRunSqlPool
    :: forall backend m a. (MonadUnliftIO m, BackendCompatible SqlBackend backend)
    => ReaderT backend m a -> Pool backend -> Maybe IsolationLevel -> m a
rawRunSqlPool r pconn mi =
    runSqlPoolWithHooks r pconn mi before after onException
  where
    before conn = do
        let sqlBackend = projectBackend conn
        let getter = getStmtConn sqlBackend
        liftIO $ connBegin sqlBackend getter mi
    after conn = do
        let sqlBackend = projectBackend conn
        let getter = getStmtConn sqlBackend
        liftIO $ connCommit sqlBackend getter
    onException conn _ = do
        let sqlBackend = projectBackend conn
        let getter = getStmtConn sqlBackend
        liftIO $ connRollback sqlBackend getter

runSqlPoolWithHooks
    :: forall backend m a before after onException. (MonadUnliftIO m, BackendCompatible SqlBackend backend)
    => ReaderT backend m a
    -> Pool backend
    -> Maybe IsolationLevel
    -> (backend -> m before)
    -- ^ Run this action immediately before the action is performed.
    -> (backend -> m after)
    -- ^ Run this action immediately after the action is completed.
    -> (backend -> UE.SomeException -> m onException)
    -- ^ This action is performed when an exception is received. The
    -- exception is provided as a convenience - it is rethrown once this
    -- cleanup function is complete.
    -> m a
runSqlPoolWithHooks r pconn i before after onException =
    runSqlPoolWithExtensibleHooks r pconn i $ SqlPoolHooks
        { alterBackend = pure
        , runBefore = \conn _ -> void $ before conn
        , runAfter = \conn _ -> void $ after conn
        , runOnException = \b _ e -> void $ onException b e
        }

runSqlPoolWithExtensibleHooks
    :: forall backend m a. (MonadUnliftIO m, BackendCompatible SqlBackend backend)
    => ReaderT backend m a
    -> Pool backend
    -> Maybe IsolationLevel
    -> SqlPoolHooks m backend
    -> m a
runSqlPoolWithExtensibleHooks r pconn i SqlPoolHooks{..} =
    withRunInIO $ \runInIO ->
    withResource pconn $ \conn ->
    UE.mask $ \restore -> do
        conn' <- restore $ runInIO $ alterBackend conn
        _ <- restore $ runInIO $ runBefore conn' i
        a <- restore (runInIO (runReaderT r conn'))
            `UE.catchAny` \e -> do
                _ <- restore $ runInIO $ runOnException conn' i e
                UE.throwIO e
        _ <- restore $ runInIO $ runAfter conn' i
        pure a

The SqlPoolHooks determines what the SqlBackend actually does - the connBegin, connCommit, and connRollback are not called, since we call hooks and ignore them:

runSqlPoolNoTransaction
    :: forall backend m a. (MonadUnliftIO m, BackendCompatible SqlBackend backend)
    => ReaderT backend m a -> Pool backend -> Maybe IsolationLevel -> m a
runSqlPoolNoTransaction r pconn i =
    runSqlPoolWithHooks r pconn i (\_ -> pure ()) (\_ -> pure ()) (\_ _ -> pure ())

So, connBegin doesn't get called, and we don't call start transaction.

With autocommit = 0, I'd expect this behavior to trigger:

If a session that has autocommit disabled ends without explicitly committing the final transaction, MySQL rolls back that transaction.

This does seem a bit surprising to me.

Postgres has a default of autocommit being disabled: https://www.postgresql.org/docs/current/ecpg-sql-set-autocommit.html

SQLite has default of autocommit being enabled: https://www.sqlite.org/c3ref/get_autocommit.html

MySQL's default is for autocommit to be enabled.

🤔

This is definitely an infelicity with how the database backends act, and I'd be in favor of resolving it. However, doing so is necessarily going to be a breaking change in behavior, so I want to communicate that in a way that forces folks to deal with it.

For runSqlPool users, they won't notice any change with the default being either on or off, since each database backend will begin a transaction as part of connBegin.

For runSqlPoolNoTransaction users, right now you need to manually COMMIT; on persistent-mysql and persistent-postgresql. But persistent-sqlite does not require a manual COMMIT; - it commits everything as the action happens.

I think the question is: what behavior do folks want from runSqlPoolNoTransaction? Do we want "you must manually commit results, or they never happen?" Or do they want "run each action in it's own transaction"?

fumieval commented 1 year ago

I think the question is: what behavior do folks want from runSqlPoolNoTransaction? Do we want "you must manually commit results, or they never happen?" Or do they want "run each action in it's own transaction"?

I expected the latter, but apparently this example uses runSqlPoolNoTransaction in "you must manually commit results" way; so turning on autocommit on persistent-postgresql probably isn't an option.

https://github.com/input-output-hk/voting-tools/blob/d14c5a456dc440b1dfe7901b6f19e53e8aadea12/test/integration/Test/Cardano/Catalyst/Helpers.hs

Perhaps we could add a warning to runSqlPoolNoTransaction telling the user that the behaviour vary depending on the backend (it might uncover some nasty bugs). For those who know what they are doing, we can export unsafeRunSqlPoolNoTransaction. What do you think?