yesodweb / persistent

Persistence interface for Haskell allowing multiple storage methods.

Unify approaches to polymorphism #1302

Open parsonsmatt opened 3 years ago

parsonsmatt commented 3 years ago

Right now, persistent has a somewhat confusing array of techniques for providing polymorphism.

The intent is to allow queries and database actions to be backend agnostic.

f 
    :: (MonadIO m, PersistStoreRead backend) 
    => ReaderT backend m a

This function works for any backend that implements the PersistStoreRead class.

That class has this (simplified) definition:

class PersistStoreRead backend where
    get 
        :: ( MonadIO m
            , PersistEntity record
            , PersistEntityBackend record ~ backend
            )
        => Key record
        -> ReaderT backend m (Maybe record)

This works pretty well - basically every possible persistent backend can support this.
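To make the type class approach concrete, here is a minimal, self-contained sketch that does not depend on persistent at all. The class ToyStoreRead, the two backends, and describe are hypothetical names invented for illustration; only the shape (a class over the backend type, with actions in ReaderT backend m) mirrors what is described above.

```haskell
import Control.Monad.IO.Class (MonadIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)

-- Hypothetical toy stand-in for PersistStoreRead: a single lookup operation.
class ToyStoreRead backend where
    toyGet :: MonadIO m => Int -> ReaderT backend m (Maybe String)

-- Two hypothetical backends with different internal representations.
newtype ListBackend = ListBackend [(Int, String)]
newtype FunBackend = FunBackend (Int -> Maybe String)

instance ToyStoreRead ListBackend where
    toyGet k = asks (\(ListBackend rows) -> lookup k rows)

instance ToyStoreRead FunBackend where
    toyGet k = asks (\(FunBackend f) -> f k)

-- Backend-agnostic code: it only knows about the class, not the backend.
describe :: (MonadIO m, ToyStoreRead backend) => Int -> ReaderT backend m String
describe k = maybe "missing" ("found: " ++) <$> toyGet k

-- The same function run against both backends.
demo :: IO (String, String)
demo = do
    a <- runReaderT (describe 1) (ListBackend [(1, "alice")])
    b <- runReaderT (describe 1) (FunBackend (const Nothing))
    pure (a, b)
```

The caller picks the backend at the runReaderT call site; describe itself never mentions a concrete backend type.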

However, we come to a problem with upsert. This is not natively handled by all backends, so we provide somewhat dumb fallbacks, and allow instances to provide better behavior.

Simplifying a bit, we have:

class (PersistStore backend) => PersistUnique backend where
    upsertBy 
        :: (MonadIO m, PersistRecordBackend record backend)
        => Unique record
        -> record
        -> [Update record]
        -> ReaderT backend m (Entity record)
    upsertBy = defaultUpsertBy 

defaultUpsertBy performs two database actions, while an efficient override might be able to do it in a single database action.
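The difference between the default and an override can be sketched with a toy class that counts database actions. Everything here (Store, ToyUpsert, SlowBackend, FastBackend) is a hypothetical stand-in, not persistent's API; the assumption is simply that a lookup and a write each cost one round trip, and that an efficient backend can do both in one.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- Hypothetical toy store: key/value rows plus a counter of database actions.
data Store = Store
    { rows  :: IORef [(String, Int)]
    , trips :: IORef Int
    }

newStore :: IO Store
newStore = Store <$> newIORef [] <*> newIORef 0

-- Each primitive counts as one database action.
lookupRow :: Store -> String -> IO (Maybe Int)
lookupRow s k = do
    modifyIORef' (trips s) (+ 1)
    lookup k <$> readIORef (rows s)

writeRow :: Store -> String -> Int -> IO ()
writeRow s k v = do
    modifyIORef' (trips s) (+ 1)
    modifyIORef' (rows s) (\rs -> (k, v) : filter ((/= k) . fst) rs)

-- Toy analogue of a class method with a default built from primitives.
class ToyUpsert backend where
    store :: backend -> Store

    toyUpsert :: backend -> String -> Int -> (Int -> Int) -> IO Int
    toyUpsert b k new update = do
        existing <- lookupRow (store b) k   -- action 1
        let v = maybe new update existing
        writeRow (store b) k v              -- action 2
        pure v

newtype SlowBackend = SlowBackend Store
newtype FastBackend = FastBackend Store

-- Uses the default: two actions per upsert.
instance ToyUpsert SlowBackend where
    store (SlowBackend s) = s

-- Overrides the default: pretends the backend needs a single action.
instance ToyUpsert FastBackend where
    store (FastBackend s) = s
    toyUpsert (FastBackend s) k new update = do
        modifyIORef' (trips s) (+ 1)
        rs <- readIORef (rows s)
        let v = maybe new update (lookup k rs)
        writeIORef (rows s) ((k, v) : filter ((/= k) . fst) rs)
        pure v
```

Both instances return the same values for the same inputs; only the number of counted actions differs, which is the property an override is meant to preserve.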

So how does SqlBackend work? Again, simplifying it a bit, we have:

instance PersistUnique SqlBackend where
    upsertBy uniqueKey record updates = do
        conn <- ask
        case connUpsertSql conn of
            Nothing -> 
                defaultUpsertBy uniqueKey record updates
            Just upsertSql -> do
                -- run the optimized action

So, SqlBackend, as it happens, is not guaranteed to have an efficient implementation of upsert. So we have a record field like:

data SqlBackend = SqlBackend
    { connUpsertSql :: Maybe MkUpsertSql
    }

If we're producing a backend for Postgres, which does have an efficient upsert, then we put a Just connUpsertSqlFunction in the record. If we're producing a backend for MySQL (which may not support it yet; I'm not sure), then we write Nothing for the field and use the slow default implementation.

We've now got two approaches for polymorphism - one is adding Maybe fields to a record, and the other is adding a type class for the relevant operations. This is unsatisfying.
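For contrast, the record field approach can be sketched on its own, again with hypothetical names (ToyBackend, upsertSql, upsertStatements) and placeholder SQL strings that are only illustrative. The point is that the choice is made by a value inside the record at runtime, not by the type.

```haskell
-- Hypothetical toy backend record; the optional field plays the role of
-- connUpsertSql in the discussion above.
data ToyBackend = ToyBackend
    { backendName :: String
    , upsertSql   :: Maybe (String -> String)  -- table name -> one statement
    }

-- Returns the statements that would be run for an upsert on the given table.
-- Falls back to two statements when no optimized template is present.
upsertStatements :: ToyBackend -> String -> [String]
upsertStatements backend table =
    case upsertSql backend of
        Nothing ->
            [ "SELECT ... FROM " ++ table ++ " WHERE ..."
            , "INSERT or UPDATE " ++ table ++ " ..."
            ]
        Just mkSql ->
            [mkSql table]

-- Same type, different capabilities, decided when the record is built.
withUpsert :: ToyBackend
withUpsert = ToyBackend
    { backendName = "with-upsert"
    , upsertSql = Just (\t -> "INSERT INTO " ++ t ++ " ... ON CONFLICT ... DO UPDATE ...")
    }

withoutUpsert :: ToyBackend
withoutUpsert = ToyBackend
    { backendName = "without-upsert"
    , upsertSql = Nothing
    }
```

Because withUpsert and withoutUpsert have the same type, nothing in a function's signature says whether it will get the one-statement or the two-statement path.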

Considerations

We want:

  1. To provide a uniform interface for database access.
  2. To allow specific database backends to provide more efficient implementations of operations.
  3. For people to write programs that can operate against different database backends.

What isn't great:

  1. Lots of different ways to accomplish the same basic thing
  2. Confusion around how this stuff all works
  3. Friction around adding new features in a backwards-compatible way.

PR #1298 adds a new type class and a new record field to SqlBackend to support streaming rows. By all accounts, it's doing everything right: the existing conventions are followed perfectly.

Alternatives

How else can we do this?

We want both MongoContext and SqlBackend to work, for example, and we also want for postgresql and mysql to work for upsert, despite sharing a SqlBackend.

ivb-supercede commented 3 years ago

and we also want for postgresql and mysql to work for upsert, despite sharing a SqlBackend.

Why (beyond backwards compatibility) do we want SqlBackend to be shared across the multiple SQL databases? Could the various SQL DBs not have specific types (e.g. PostgreSQLBackend) to allow for polymorphism per-backend at the type level?

Or is the idea to have as much compatibility between backends as possible to allow for re-use of code? In which case, SqlBackend could be made a BaseBackend of each of the per-DB types.

parsonsmatt commented 3 years ago

Yeah, I think there's some desire to allow folks to write database code that can interop with, e.g., Postgres in prod and SQLite in dev/test. I don't personally buy the utility of that, but it's not a use case I want to dismiss.

The BaseBackend machinery needs to go and get switched out entirely for the BackendCompatible stuff - it's much more flexible and useful.
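As a rough sketch of how per-database types could still share code, here is a toy projection class loosely modeled on the idea behind BackendCompatible. The names Compatible, project, ToySqlBackend, ToyPostgres, and ToySqlite are all hypothetical and do not reproduce persistent's actual definitions; the assumption is only that each specific backend can be projected to a common one.

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- Toy stand-in for the shared SQL backend: it just "runs" a query string.
newtype ToySqlBackend = ToySqlBackend { runSql :: String -> String }

-- Hypothetical per-database wrappers around the shared backend.
newtype ToyPostgres = ToyPostgres ToySqlBackend
newtype ToySqlite = ToySqlite ToySqlBackend

-- A `sub` backend is compatible with `sup` if it can be projected to it.
class Compatible sup sub where
    project :: sub -> sup

instance Compatible ToySqlBackend ToySqlBackend where
    project = id

instance Compatible ToySqlBackend ToyPostgres where
    project (ToyPostgres b) = b

instance Compatible ToySqlBackend ToySqlite where
    project (ToySqlite b) = b

-- Shared code: works for any backend that projects to ToySqlBackend.
selectOne :: Compatible ToySqlBackend backend => backend -> String
selectOne backend = runSql (project backend) "SELECT 1"
```

With distinct types per database, backend-specific operations could be offered only on ToyPostgres, while generic code such as selectOne keeps working across all of them.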