alevy / postgresql-orm

An Haskell ORM (Object Relational Mapping) and migrations DSL for PostgreSQL.
http://simple.cx
GNU General Public License v3.0
78 stars 12 forks source link

modify model before save #2

Open alevy opened 10 years ago

alevy commented 10 years ago

@nd2s wrote (alevy/simple.cx#2):

Is there a way to easily modify the model before INSERT/UPDATE? I need to set created_at and update_at timestamps on every save.

alevy commented 10 years ago

The only callback on a save is for validations --- it is pure and doesn't return a Model so won't help you in this case.

There are three ways I can think of to address this issue:

  1. You could use postgresql triggers on the table to have postgresql update the created_at column on INSERT and updated_at on UPDATE. You would need to customize your model instance to make sure save doesn't explicitly overwrite those columns with the current value. I think that would be pretty gnarly right now (I think you'd overwrite modelQWriteColumns in the Model's modelIdentifiers field), but there's no good reason it's gnarly other than it hasn't come up yet -- we can easily add helpers and/or a combinator for Generic instances to make that pretty straigh forward. If you happen to go this route, i'd be happy to help and am very inclined to add stuff to the library to make this easy.
  2. We could support callbacks. I've thought of this before and have gone back and forth on whether it's worth it. I'm reluctant to allow any custom side effects in the pathway of save since the contract with the caller should really be that only a single query is executed. However, without side-effects, you may not be able to do anything interesting other than validation. Maybe some of this can be achieved by just letting validate return a transformed model? See the aside below for a way to solve the particular timestamp use case without side-effectful callbacks.
  3. You could have a custom saveMyModel function for your model, which just modifies those fields in the approriate way and otherwise wraps save. If you find yourself doing this for many of your models, creating a typeclass is your friend, in fact, this is probably the best way to get the callbacks functionality in general without support from the library yet.
class Model a => TimestampModel a where
    setCreatedAt :: a -> UTCTime -> a
    setUpdatedAt :: a -> UTCTime -> a

saveTimestamp :: TimestampModel a => Connection -> a -> IO ()
...

Aside

This doesn't address your problem directly (more on that below), but I've taken to using a wrapper type for timestamps that looks something like:

data DBTime = Now | DBTime { dbTime :: UTCTime }

instance FromField DBTime where
  fromField = DBTime <$> fromField

instance ToField DBTime where
  toField Now = Plan "now()"
  toField (DBTime t) = toField t

In summary, this type let's you stick a Now in your mymodelCreatedAt field when you first construct it and leaves determining the actual time for the database (which is good because you (a) can do it in pure code, and (b) avoids having to worry about app server clocks being synchronized). I want to add this type but haven't settled on an interface yet, so feedback welcome...

nd2s commented 10 years ago

Thank you for the quick and detailed reply!

  1. Yes, triggers would be a clean and safe way of handling data modifications. It would be pretty good if there were be a way to say in modelInfo what fields should be updated after successful save (using postgres' RETURNING keyword to get back data). Would help to use database generated identifiers of all kind (like unguessable youtube-like ids).
  2. Pure code does not really help. Only use case I can think of is parsing/calculating some values after successful SELECT. But that can probably better done using custom data types most of the time. For other real-world use cases like updating or invalidating a cache, side effects are a must.
  3. Looks like the best way for side effect "contaminated" callbacks. But with that approach it is possible to accidently use save instead of saveTimestamp - and compiler wouldn't complain.

A special DBTime type would be pretty useful for my particular usecase. I can't really give you feedback on that, though - just using Haskell for the first time on a more-serious project to get into it.

I'm using a special "nullTime" (UTCTime set to 1970 in my case) in my code right now (as default value for "locked_at" timestamp of users relation). Maybe it would make sense to add something similar as default value.

Concluding I'd say 1. would be the best solution for handling these kind of data changes. Updating timestamps should be done in db anyway. No idea about handling more general callbacks.

saurabhnanda commented 6 years ago

Chiming in for this. Also, sometimes validations also need to be non-pure.

alevy commented 6 years ago

Also, sometimes validations also need to be non-pure.

That's definitely functionallty true, but for the interface that postgresql-orm provides, we need to balance common cases with a desire to avoid the unpredictable performance that results from allowing IO in callbacks. One of the motivations was that Rails validations tended (for us anyway) to end up generating a bunch of extra DB queries that were hidden and resulted in really hard to debug performance issues.

So the current thinking is --- do stateful validation in application logic, or a wrapper library. Though, of course, that's just the justification for the existing API, and that thinking is debatable and could be changed.

alevy commented 6 years ago

Side note (since I don't know where else to put it): thanks for reviving these various issues. Our current use of this library has mostly stabilized so there hasn't been much urgency in improving it beyond the occasional bug fix. But if there's interest in using it, I more than welcome contributions and improvements! Thanks!