dylex / postgresql-typed

Haskell PostgreSQL library with compile-time type inference
http://hackage.haskell.org/package/postgresql-typed
Other
84 stars 12 forks source link

Associating Haskell fields with table columns #3

Open joneshf opened 7 years ago

joneshf commented 7 years ago

First, this is an amazing project! Thanks so much for maintaining it.

Something I get nervous about is accidentally putting the right type of Haskell data into the wrong column in the database, or getting the right type of data from the wrong column in the database. For instance, assuming a table like:

CREATE TABLE foo (
    bar character varying NOT NULL,
    baz character varying NOT NULL
);

And some Haskell definitions:

newtype Bar
  = Bar Text

instance PGColumn "character varying" Bar where
  ...

instance PGParameter "character varying" Bar where
  ...

newtype Baz
  = Baz Text

instance PGColumn "character varying" Baz where
  ...

instance PGParameter "character varying" Baz where
  ...

data Foo
  = Foo
    { bar :: Bar
    , baz :: Baz
    }

There's nothing that really stops me from writing a query where the fields are mixed up, like:

getFoos conn = do
  foos <- pgQuery conn [pgSQL| SELECT bar, baz FROM foo |]
  pure $ go <$> foos
  where
  go :: (Baz, Bar) -> Foo
  go (baz, bar) = Foo {..}

Everything type checks on the Haskell side, and it's a valid SQL expression. But, I've asked for the values from foo.bar and foo.baz–in that order–in SQL. Since both are varchars, the instances can be found in either order on the Haskell side. And a small typo has led to the data being wrong.

Do you have any thoughts on how to mitigate the chance for mistakes here? My first thought was to somehow associate the types in Foo with the column names in the table foo. Is that feasible? I would imagine that you'd need an additional type class and a different quasi quoter in order to handle that. But I haven't actually looked at anything yet. That might also be more complex than need be.

dylex commented 7 years ago

I'll think more about this, but a couple quick thoughts:

For query result values, postgres does provide column names (the same ones you see is psql headers -- they don't have table names and can be changed by AS). It does seem theoretically possible to create a newtype wrapper indexed by an additional column name type literal that only accepts values from columns of that name. This would take a little extra to implement and need some way to turn on, but I think it's doable. However, there is no equivalent for query paremeters, so this would only provide a layer of safety coming out of the database, not going in.

Another approach could be to do the equivalent of creating newtypes for columns you want to distinguish in the database itself using CREATE TYPE. You could use this to create a new type equivalent to another one but with a different name/OID (by passing all the same input/output functions). This would be a bit bulky, though, and requires extra steps to handle any columns you want to distinguish this way. (There may also be a way to do it with a singleton row type wrapper.)

The more traditional way to go with this, of course, is an ORM layer that constructs queries for you and makes sure your column references match. This is certainly something that could be built on top of this library, but feels a bit antithetical to the goal of typechecking arbitrary queries.

dylex commented 7 years ago

Oh, one more option along the lines of an ORM: I've been playing with this new Database.PostgreSQL.Typed.Relation module that automatically creates data types corresponding to entire tables. In addition, it provides marshalling for row types of that table which will guarantee your fields match up. This has a few limitations: you have to deal with the entire table at once (all the columns); the SQL syntax is ugly (SELECT foo.*::foo FROM foo); there is some postgres performance overhead from the row construction. I'm working on improving at least some of these things, but for now it seems too cumbersome to be useful, though it would solve your problem.

joneshf commented 7 years ago

Thanks for the quick reply! I didn't know you could define additional types in sql. That might be the most straight forward way. I'll do some reading. Thanks for the hint!