tomjaguarpaw / haskell-opaleye

Other
599 stars 115 forks source link

Working read/write-only fields by truly unhooking them #591

Open tysonzero opened 1 month ago

tysonzero commented 1 month ago

As mentioned in the documentation the current readOnlyTableField doesn't work in typical cases due to it feeding DEFAULT into updates. This makes sense as there is no single value/expression/keyword you can feed into both inserts and updates that correctly models the concept of being "read only", the correct way to do it is to omit any mention of the field entirely.

As a side note readOnlyTableField does actually work correctly for generated columns due to them allowing the keyword DEFAULT for inserts and updates even though all other values are rejected due to their read-only nature.

Given the above, we should have tableField functions for all possible combinations. For reads the choices are simply hooked up vs not hooked up. For writes the choices are hooked up with no default, hooked up with default, and not hooked up. You could technically include "optionally not hooked up" type of stuff, but it's not strictly needed as you can always feed in Nothing for inserts and the existing value for updates to get the same behavior.

This gives us the following 6 combinations:

requiredTableField :: String -> TableFields (Field_ n a) (Field_ n a)
optionalTableField :: String -> TableFields (Maybe (Field_ n a)) (Field_ n a)
readOnlyTableField :: String -> TableFields () (Field_ n a)
writeOnlyRequiredTableField :: String -> TableFields (Field_ n a) ()
writeOnlyOptionalTableField :: String -> TableFields (Maybe (Field_ n a)) ()
missingTableField :: TableFields () ()

readOnlyTableField and missingTableField should emit zero code when writing, so no DEFAULT or anything like that, they should just be omitted entirely. Likewise the bottom three should all emit zero code when reading. I skipped the inferable tableField but it would have a second equivalent writeOnlyTableField.

For my use cases I see the most benefit from readOnlyTableField for generated columns and one-time-default-write columns like created and id, and from missingTableField for reserving space on core fully-polymorphic types for data that can't be obtained in a single query, be it due to a one-to-many or data from an external system.

Example read-only column code:

CREATE TABLE "email" (
    "id" uuid NOT NULL DEFAULT gen_random_uuid(),
    "localpart" text NOT NULL,
    "domain" citext NOT NULL,
    "address" text GENERATED ALWAYS AS ("localpart" || '@' || "domain") STORED,
     PRIMARY KEY ("id")
);

data Email a b c d = Email
    { id :: a
    , localpart :: b
    , domain :: c
    , address : d
    }

$(makeAdaptorAndInstance "pEmail" ''Email)

type EmailFieldWrite = Email
    ()
    (Field SqlText)
    (Field SqlCitext)
    ()

type EmailFieldRead = Email
    (Field SqlUuid)
    (Field SqlText)
    (Field SqlCitext)
    (Field SqlText)

emailTable :: Table EmailFieldWrite EmailFieldRead
emailTable = table "email" . pEmail $ Email
    { id = readOnlyTableField "id"
    , localpart = requiredTableField "localpart"
    , domain = requiredTableField "domain"
    , address = readOnlyTableField "address"
    }

Example missing column code:

data Post a b c d e = Post
    { id :: a
    , title :: b
    , description :: c
    , created :: d
    , comments :: e
    }
$(makeAdaptorAndInstance "pPost" ''Post)

data Comment ...
$(makeAdaptorAndInstance "pComment" ''Comment)

type PostWrite = Post
    ()
    (Field SqlText)
    (Field SqlText)
    ()
    ()

type PostRead = Post
    (Field SqlUuid)
    (Field SqlText)
    (Field SqlText)
    (Field SqlTimestamptz)
    ()

postTable :: Table PostWrite PostRead
postTable = table "post" . pPost $ Post
    { id = readOnlyTableField "id"
    , name = requiredTableField "title"
    , description = requiredTableField "body"
    , created = readOnlyTableField "created"
    , comments = missingTableField
    }

type PostObject = Post
    UUID
    Text
    Text
    UTCTime
    [CommentObject]

For additional context see #447 and #590

tomjaguarpaw commented 1 month ago

Just to check I understand, do the writing parts of these behave as indicated in the following?

-- Generates "field = value"
requiredTableField :: String -> TableFields (Field_ n a) (Field_ n a)
writeOnlyRequiredTableField :: String -> TableFields (Field_ n a) ()

-- Just: Generates "field = value"
-- Nothing: Generates "field = DEFAULT"
optionalTableField :: String -> TableFields (Maybe (Field_ n a)) (Field_ n a)
writeOnlyOptionalTableField :: String -> TableFields (Maybe (Field_ n a)) ()

-- Generates nothing for "field"
readOnlyTableField :: String -> TableFields () (Field_ n a)
missingTableField :: TableFields () ()
tysonzero commented 1 month ago

Correct!

tomjaguarpaw commented 1 month ago

Great, in that case maybe we can do them all as versions of a single "write type" that is something like

WriteField a = Omitted | SetToDefault | SetToValue a
tomjaguarpaw commented 1 month ago

I just noticed a related issue on rel8: https://github.com/circuithub/rel8/issues/193

tysonzero commented 1 month ago

Yeah I think many Haskell db libraries handle this type of stuff poorly. I don't think persistent handles default/generated columns well either.

evertedsphere commented 1 month ago

As a side note readOnlyTableField does actually work correctly for generated columns due to them allowing the keyword DEFAULT for inserts and updates even though all other values are rejected due to their read-only nature.

For updates, Opaleye generates a syntactically valid query that will actually "typecheck" against the schema, so to speak, but unless the DEFAULT value is a constant, the semantics of that query is likely not what you want:

db=# create temp table t(
  id integer primary key generated always as identity,
  created_at timestamptz not null default now(),
  v integer not null
);
CREATE TABLE

db=# insert into t(v) values(1);
INSERT 0 1

db=# select * from t;
 id |          created_at           | v
----+-------------------------------+---
  1 | 2024-05-20 08:13:51.872027+00 | 1

db=# update t set v = 2;
UPDATE 1

db=# select * from t;
 id |          created_at           | v
----+-------------------------------+---
  1 | 2024-05-20 08:13:51.872027+00 | 2

db=# update t set v = 3, id = default, created_at = default;
UPDATE 1

db=# select * from t;
 id |          created_at           | v
----+-------------------------------+---
  2 | 2024-05-20 08:14:21.103436+00 | 3

The query updates the field by recomputing the DEFAULT expression. In particular, this means that a table with a DEFAULTed foreign key target cannot be updated in this manner (since that would violate referential integrity).

(Which is just to say that the "nothing generated, not even DEFAULT" case is vital.)

evertedsphere commented 1 month ago

Ah, I see this was mentioned in https://github.com/tomjaguarpaw/haskell-opaleye/issues/590#issuecomment-2097667974 and you were talking about generated columns, not the…serial thing.

tysonzero commented 1 month ago

Yes just GENERATED ALWAYS AS ( generation_expr ) STORED