Help with writing a code-generator for Opaleye

saurabhnanda commented 6 years ago

I'm half-way through writing a code-generation script [1] and have reached the point where I need to hook-up my record-types to Opaleye. We've tried the recommended approach of polymorphic records in the past and aren't huge fans of it for the following reasons:

More type annotations are required
Error messages become worse

I'm investigating what it would take to go back to plain-old monomorphic records. Given that context, I'm referring to the monomorphic tutorial and am wondering:

What do the following actually do: Default Unpackspec BirthdayColumn BirthdayColumn and Default Opaleye.Internal.TableMaker.ColumnMaker BirthdayColumn BirthdayColumn and Default O.QueryRunner BirthdayColumn Birthday? (Actually the last one is easy to understand. It's the first two which are absolutely new to me)
All three instances seem to use the exact same definition for def. Why can't just one be defined, and the other two simply reuse the first one?
Under what circumstances will the instance-definitions of these type-classes be different for a given record type?
Is it possible to have default generic implementations for them for the most common out-of-the-box use-case?

[1] Not TemplateHaskell, but actually generating Haskell source files

tomjaguarpaw commented 6 years ago

I'm half-way through writing a code-generation script

Great! I'll be really interested to see how this turns out.

What do the following actually do

I think you already know what Default p BirthdayColumn BirthdayColumn does :) It makes the default value of p BirthdayColumn BirthdayColumn. What I think you mean is "what do each of the product profunctors do"?

Unpackspec

There's documentation for this. Is it insufficient?

https://hackage.haskell.org/package/opaleye-0.5.4.0/docs/Opaleye-Internal-Unpackspec.html

ColumnMaker

ColumnMaker allows you to turn a Table into a Query. However, on closer investigation it became clear to me that it was identical to Unpackspec so I removed it.

QueryRunner

As you say, you already know what this does.

All three instances seem to use the exact same definition for def. Why can't just one be defined, and the other two simply reuse the first one?

That's a nice observation. I merged two of the definitions for you:

https://github.com/tomjaguarpaw/haskell-opaleye/blob/61e0a81455238614b5b3246b95de86d0608458b3/Doc/Tutorial/TutorialBasicMonomorphic.lhs#L134

In the next version of Opaleye ColumnMaker will be gone but you can use the new code I wrote to make Default Binaryspec, etc., instances.

The other one can't use the same implementation. It has the same structure but the types are different. The whole point of using polymorphic records is to make the types the same so that a single implementation could be used!

Under what circumstances will the instance-definitions of these type-classes be different for a given record type?

They won't be different. Under all reasonable circumstances (the structure of) the implementations will be exactly the same. But because the types differ you have to write out different instances with the same structure. Again, the whole point of using polymorphic records is to make the types the same so that a single implementation could be used!

Is it possible to have default generic implementations for them for the most common out-of-the-box use-case?

I don't know what you mean here. If you're asking whether it's technically possible to write

birthdayColumnDef ::
   (Applicative (p BirthdayColumn),
    P.Profunctor p,
    Default p (Column PGText) (Column PGText),
    Default p (Column PGDate) (Column PGDate)) =>
   p BirthdayColumn BirthdayColumn
 birthdayColumnDef = BirthdayColumn <$> P.lmap bdNameColumn D.def
                                    <*> P.lmap bdDayColumn  D.def

 instance Default Unpackspec BirthdayColumn BirthdayColumn where
   def = birthdayColumnDef

 instance Default Opaleye.Internal.TableMaker.ColumnMaker BirthdayColumn BirthdayColumn where
   def = birthdayColumnDef

using Generic, then the answer is "yes". But no one's bothered to implement that. Perhaps you'd like to? Implementing

instance Default O.QueryRunner BirthdayColumn Birthday where
  def = Birthday <$> P.lmap bdNameColumn D.def
                 <*> P.lmap bdDayColumn  D.def
runBirthdayQuery :: PGS.Connection
                 -> Query BirthdayColumn
                 -> IO [Birthday]
runBirthdayQuery = runQuery

using Generic is probably also possible but a bit trickier.

I hope that helps. Please let me know how you get on with this.

tomjaguarpaw commented 6 years ago

I realised that it's possible to share instance declarations by using UndecidableInstances. What do you think?

https://github.com/tomjaguarpaw/haskell-opaleye/blob/1705ebcc6c3f1b82c230221618f165c0470711d0/Doc/Tutorial/TutorialBasicMonomorphic.lhs#L131

tomjaguarpaw commented 6 years ago

(As an aside, the title of the issue "Using Opaleye without polymorphic records and profunctors?" is very misleading. Whereas it's possible to use Opaleye without polymorphic records, it's not possible to use Opaleye without profunctors, nor are you attempting to.)

saurabhnanda commented 6 years ago

There's documentation for this. Is it insufficient? https://hackage.haskell.org/package/opaleye-0.5.4.0/docs/Opaleye-Internal-Unpackspec.html

The first confusion was how Unpackspec is different from ColumnMaker. Now the new confusion is how Unpackspec is different from DefaultQueryRunner. Another way to attack the question is, under what circumstances would one want to do the following:

For example, the Default instance of type Unpackspec (Column a, Column b) (Column a, Column b) allows you to manipulate or extract the two PrimExprs inside a (Column a, Column b).

Moving to the next point...

They won't be different. Under all reasonable circumstances (the structure of) the implementations will be exactly the same. But because the types differ you have to write out different instances with the same structure. Again, the whole point of using polymorphic records is to make the types the same so that a single implementation could be used!

If under all reasonable circumstances, the structure is going to be the same, i.e...

DataConstructor
  <$> P.lmap fieldAccessor1 [D.def | [ required "colname1" | optional "colname1" ]]
  <*> P.lmap fieldAccessor2 [D.def | [ required "colname2" | optional "colname2" ]]
  <*> P.lmap fieldAccessor3 [D.def | [ required "colname3" | optional "colname3" ]]
  ....

... it should be theoretically possible to use GHC Generics to get this done, right?

I realised that it's possible to share instance declarations by using UndecidableInstances. What do you think?

I'm not aware of the complete impact/side-effects of using UndecidableInstances. My line-of-questioning is more around why, both, ColumnMaker AND Unpackspec, are needed in the first place. But, I think you've already decided to change that in future version of Opaleye, right? (Thank you for that!) Right now, defining a common function and using it in both instances is acceptable for what I'm doing (it seems safer than using UndecidableInstances for no reason other than "the fear of the unknown").

I'd also like to deeply understand why, both, Unpacksec AND DefaultQueryRunner, are needed? I keep running into different variants of this same question every now and then.

(As an aside, the title of the issue "Using Opaleye without polymorphic records and profunctors?" is very misleading. Whereas it's possible to use Opaleye without polymorphic records, it's not possible to use Opaleye without profunctors, nor are you attempting to.)

You are correct. I'm renaming this thread to "Help with writing code-generator for Opaleye".

Also, I've been staring long and hard at this usage of makeAdaptorAndInstances and pWidget and am wondering:

Is pWidget required ONLY to define the Table? There is no other use-case for users of Opaleye to bother with Product Profunctors, right? (especially if this problem can be solved by GHC Generics OR code-gen somehow)
Can the required/optional annotations be figured out from the Maybe/Nullable annotations in PGRead and PGWrite? Why are BOTH of them (i.e. Maybe/Nullable AND required/optional) needed in the first place?

And now for a new question:

Do the number of fields in a PGRead and PGWrite need to be the same? There's a common pattern that we have identified where the row-creation (and probably row-updation) interface ALWAYS has three fields/columns lesser than the read interface, name: id, created_at, and updated_at.

Is it possible to bake this right into the Opaleye Table data structure, for example:

data Tag = Tag 
  {
    _tagId :: TagId
  , _tagClientId :: Integer
  , _tagName :: Text
  , _tagColourCode :: Text
  , _tagCreatedAt :: UTCTime
  , _tagUpdatedAt :: UTCTime} 
  deriving (Eq, Show, Generic)

data TagPGWrite = TagPGWrite
  {
    _pgwClientId :: Column PGInt4
  , _pgwName :: Column PGText
  , _pgwColourCode :: Column PGText
  }

data TagPGRead = TagPGRead
  {
    _pgrId :: Column PGInt4
  , _pgrCreatedAt :: Column PGTimestamp
  , _pgrUpdatedAt :: Column PGTimestamp
  , _pgrClientId :: Column PGInt4
  , _pgrName :: Column PGText
  , _pgrColourCode :: Column PGText
  }

What would the Unpackspec/ColumnaMaker, DefaultQueryRunner and Table definitions for something like the above, look like?

saurabhnanda commented 6 years ago

Here's what I tried, and I'm running into an error. Obviously I don't understand what's really going on here :)

tagTable :: Table TagPGWrite TagPGRead
tagTable = Table
           "tags"
           (TagPGRead
             <$> P.lmap _pgrId (required "id")
             <*> P.lmap _pgrCreatedAt (required "created_at")
             <*> P.lmap _pgrUpdatedAt (required "updated_at")
             <*> P.lmap _pgrClientId (required "client_id")
             <*> P.lmap _pgrName (required "name")
             <*> P.lmap _pgrColourCode (required "colour_code")
           )

Error:

    53  12 error           error:
     • Couldn't match type ‘TagPGRead’ with ‘TagPGWrite’
       Expected type: Table TagPGWrite TagPGRead
         Actual type: Table TagPGRead TagPGRead
     • In the expression:
         Table
           "tags"
           (TagPGRead <$> P.lmap _pgrId (required "id")
            <*> P.lmap _pgrCreatedAt (required "created_at")
            <*> P.lmap _pgrUpdatedAt (required "updated_at")
            <*> P.lmap _pgrClientId (required "client_id")
            <*> P.lmap _pgrName (required "name")
            <*> P.lmap _pgrColourCode (required "colour_code"))
       In an equation for ‘tagTable’:
           tagTable
             = Table
                 "tags"
                 (TagPGRead <$> P.lmap _pgrId (required "id")
                  <*> P.lmap _pgrCreatedAt (required "created_at")
                  <*> P.lmap _pgrUpdatedAt (required "updated_at")
                  <*> P.lmap _pgrClientId (required "client_id")
                  <*> P.lmap _pgrName (required "name")
                  <*> P.lmap _pgrColourCode (required "colour_code")) (intero)

saurabhnanda commented 6 years ago

I noticed that in the monomorphic tutorial we don't have separate types for PGRead and PGWrite. Is this just missing from the tutorial or is this currently not possible with monomorphic records?

saurabhnanda commented 6 years ago

I'm trying to peel another layer of the onion and am not sure why the first snippet type-checks, but the second doesn't:

-- THIS TYPECHECKS
tagTable :: Table TagPGWrite TagPGRead
tagTable = Table "tags" $
           TableProperties{tablePropertiesWriter=undefined, tablePropertiesView=undefined}

-- THIS DOESN'T TYPECHECK
tagTable :: Table TagPGWrite TagPGRead
tagTable = Table "tags" $
           TableProperties{tablePropertiesWriter=undefined, tablePropertiesView=(View TagPGRead)}

Error for the second snippet:

    71  12 error           error:
     • Couldn't match type ‘Column PGInt4
                            -> Column PGTimestamp
                            -> Column PGTimestamp
                            -> Column PGInt4
                            -> Column PGText
                            -> Column PGText
                            -> TagPGRead’
                      with ‘TagPGRead’

tomjaguarpaw commented 6 years ago

The first confusion was how Unpackspec is different from ColumnMaker

They were the same by accident. I've now merged them. (I realise you know this. I'm just stating it again for completeness.)

how Unpackspec is different from DefaultQueryRunner

(You mean "QueryRunner". There's no such thing as "DefaultQueryRunner". There's Default QueryRunner a b, but that's a constraint, which is a different kind of thing altogether.)

Anyway, Unpackspec allows the implementation of Opaleye to iterate over the columns in a datatype and do stuff internally in the AST. QueryRunner provides parsers for the Postgres serialisation format so that you can decode query responses that come back from the server.

under what circumstances would one want to do the following ...

If you were writing the internals of a database library.

If under all reasonable circumstances, the structure is going to be the same ... it should be theoretically possible to use GHC Generics to get this done, right?

Yes, absolutely. It's just that no one has done it yet. And even if the structure was different you could use GHC Generics to get it done!

I'm not aware of the complete impact/side-effects of using UndecidableInstances

Fair enough. Nor I. I think that I've found an even better way now, anyway.

My line-of-questioning is more around why, both, ColumnMaker AND Unpackspec, are needed in the first place. But, I think you've already decided to change that in future version of Opaleye, right?

Yes indeed. They were identical. I'm not sure why. Maybe they started off different.

Right now, defining a common function and using it in both instances is acceptable for what I'm doing

In a version of Opaleye in the near future you won't even have to write two instances! Stay tuned ... (or see https://github.com/tomjaguarpaw/haskell-opaleye/tree/packmapcolumn for a sneak preview).

I'd also like to deeply understand why, both, Unpacksec AND DefaultQueryRunner, are needed? I keep running into different variants of this same question every now and then.

(You mean QueryRunner). Well, they do completely different things. One allows you to manipulate the Columns inside a container. The other one contains a RowParser allowing you to decode the results coming back from the server. This is really foundational stuff so if you still don't get it then please come back to me.

Is pWidget required ONLY to define the Table?

It's not required to define the table, as you can see from the monomorphic tutorial in which it is not used for the widget table.

There is no other use-case for users of Opaleye to bother with Product Profunctors, right?

There's another use case for pWidget, which is aggregation. But you don't need it there either. See again the monomorphic tutorial.

But what do you mean by "bother with Product Profunctors"? Something has got to generate the instances needed, either with Generic or with TH, and it may as well be the product-profunctors library. The whole point of the product-profunctors library is to generate and manipulate these instances.

(especially if this problem can be solved by GHC Generics OR code-gen somehow)

Sure, but where will you put the Generic or code-gen code? It may as well go in product-profunctors.

Can the required/optional annotations be figured out from the Maybe/Nullable annotations in PGRead and PGWrite? Why are BOTH of them (i.e. Maybe/Nullable AND required/optional) needed in the first place?

They're not both needed. In principle you can infer the required/optional but no one's ever implemented that in Opaleye. Something like this should do the trick: https://github.com/tomjaguarpaw/haskell-opaleye/commit/18f7357e2aaf7629ec6dc7060c80baf7c8d6b8e0

Do the number of fields in a PGRead and PGWrite need to be the same?

No

Is it possible to bake this right into the Opaleye Table data structure

Sort of. There's a bit of a problem with that because Nothing in a write type means "autogenerate a new value". This is what you want for inserts but it's not what you want for updates. I think we should put this one on the backburner and get all the other issues sorted out first. Then we can come back to this issue. I'm not convinced that removing id, createdAt and updatedAt from the write type is the correct way to handle this anyway.

I noticed that in the monomorphic tutorial we don't have separate types for PGRead and PGWrite. Is this just missing from the tutorial or is this currently not possible with monomorphic records?

It's possible with monomorphic types but it is omitted from the basic tutorial for simplicity. The basic tutorial doesn't treat writes at all.

I'm trying to peel another layer of the onion and am not sure why the first snippet type-checks, but the second doesn't

What are you trying to do here? This doesn't look like it makes any sense at all and is definitely not a suported way of creating a table! I think your question is moot, but to answer it directly, the first one typechecks because the type of the thing you give to View must have type TagPGRead. undefined does (because it is of type forall a. a) but TagPGRead doesn't (because it is of the type given in the error message).

saurabhnanda commented 6 years ago

Let me paste what my code-gen is doing with polymorphic records first. The idea I'm churning in my head is whether it would make things easier for everyone in our team if the core DB records weren't polymorphic, but monomorphic.

I'm trying to peel another layer of the onion and am not sure why the first snippet type-checks, but the second doesn't

What are you trying to do here?

I was trying to define a Table with different number of fields in PGW and PGR.

tomjaguarpaw commented 6 years ago

Let me paste what my code-gen is doing with polymorphic records first. The idea I'm churning in my head is whether it would make things easier for everyone in our team if the core DB records weren't polymorphic, but monomorphic.

OK sure. If you can find out a good way of working with monomorphic records then that would be good for everyone using Opaleye!

I was trying to define a Table with different number of fields in PGW and PGR.

You definitely don't need the internals to do that. You should be able to do everything using <$>, <*>, dimap.

saurabhnanda commented 6 years ago

The output from my code-generator is available at https://gist.github.com/saurabhnanda/7866caf670c8af884fe7286fe28020d7

I need to now hook this up with Opaleye by creating the Table values and implementing the various typeclasses. Right now I'm using polymorphic records, because I don't know how to use monomorphic records properly with Opaleye, especially if the PGW and PGR types are different.

I'd like to experiment with using monomorphic records, because the compiler errors are much easier to read. With polymorphic records if you get the type of one of the fields incorrect, it is almost impossible to figure out which one, based on the error message.

The other thing I'm thinking about is what should be the write/update interface for these auto-generated models. In every model there will be some fields that should never be touched by the Haskell layer, they will always be filled/updated by the DB. In our case almost every table/model has id, created_at, updated_at.

saurabhnanda commented 6 years ago

@tomjaguarpaw is there any way to have PGW and PGR records with different number of fields (without using the () types trick with polymorphic records)?

saurabhnanda commented 6 years ago

I went with the regular polymorphic records, for now. I'm now writing a bunch of helper functions & type-classes to make basic CRUD operations easier. Does the following seem like a good idea? Can you spot any obvious deficiencies in this design?

filterAndSelect
  :: (HasDatabase m,
      Default Unpackspec columns columns,
      Default QueryRunner columns haskells,
      Default ColumnMaker t t)
  => Table a t
  -> (t -> Column PGBool)
  -> (t -> columns)
  -> m [haskells]
filterAndSelect table filterFn columnSelector = Foundation.runQuery $ proc () -> do
  r <- queryTable table -< ()
  restrict -< (filterFn r)
  returnA -< (columnSelector r)

filterTable
  :: ( Default ColumnMaker columns columns
     , Default QueryRunner columns haskells
     , Default Unpackspec columns columns, HasDatabase m )
  => Table a columns -> (columns -> Column PGBool) -> m [haskells]
filterTable table filterFn = filterAndSelect table filterFn Prelude.id

class
  (
    Default QueryRunner pgr h
  , Default Constant h pgw
  , Default Constant pk (Column pk)
  , QueryRunnerColumnDefault pk pk
  , HasId h pk
  , HasId pgw (Maybe (Column pk))
  , HasId pgr (Column pk)
  , Default Unpackspec pgr pgr
  , Default ColumnMaker pgr pgr
  , Default Opaleye.Updater.Updater pgr pgw
  ) => DbModel h pk pgw pgr | h -> pk, pk -> h, h -> pgw, h -> pgr where

  tableFor :: h -> Table pgw pgr

  createModel :: (HasDatabase m) => h -> m h
  createModel x = Foundation.runInsertManyReturning (tableFor x) [constant x] >>= \case
    [r] -> pure r

  findByPk :: (HasDatabase m) => pk -> m h
  findByPk pkVal = filterTable (tableFor (undefined :: h)) (\r -> (r ^. id) .== (constant pkVal)) >>= \case
    [r] -> pure r

  deleteModel :: (HasDatabase m) => h -> m ()
  deleteModel x = void $ Foundation.runDelete (tableFor (undefined :: h)) (\r -> (r ^. id) .== (constant $ x ^. id))

  deleteModelByPk :: (HasDatabase m) => pk -> m ()
  deleteModelByPk pkVal = void $ Foundation.runDelete (tableFor (undefined :: h)) (\r -> (r ^. id) .== (constant pkVal))

  updateModel :: (HasDatabase m) => pk -> (pgr -> pgr) -> m h
  updateModel pkVal up = Foundation.runUpdateReturning (tableFor (undefined :: h)) up (\r -> (r ^. id) .== (constant pkVal)) (Prelude.id) >>= \case
    [r] -> pure r

tomjaguarpaw commented 3 years ago

Closing as stale. Feel free to reopen if necessary.

tomjaguarpaw / haskell-opaleye

Help with writing a code-generator for Opaleye #323