agrafix / elm-bridge

Haskell: Derive Elm types from Haskell types
BSD 3-Clause "New" or "Revised" License

Support for simple phantom types #35

Open arbus opened 6 years ago

arbus commented 6 years ago

Currently, if we have a simple phantom type like this in Haskell:

newtype Id a = Id { unId :: Text }

there is a valid Elm equivalent:

type Id a = Id { unId : String }

However, trying to derive it with deriveElmDef fails with the error message:

Oops, can only derive data and newtype, not this: NewtypeD [] Main.Id [KindedTV a_6989586621679074905 StarT] Nothing (RecC Main.Id [(Main.unId,Bang NoSourceUnpackedness NoSourceStrictness,ConT Data.Text.Internal.Text)]) []

Redefining Id with data instead of newtype, and passing a dummy type in the DefineElm clause, as in DefineElm (Proxy @(Id Text)), produces the following Elm code, which is invalid:

type alias Id a =
   { unId: String
   }

and fails with the error message:

Type alias `Id` cannot have unused type variables: a

12|>type alias Id a =
13|>   { unId: String
14|>   }

You probably need to change the declaration like this:

type alias Id = ...

What is the current best strategy for sharing simple phantom types across Haskell and Elm?

bartavelle commented 6 years ago

I will look into this as soon as I am back home, but it will take a couple of weeks. Is this an urgent problem?

arbus commented 6 years ago

Not urgent at all, enjoy your break!

chshersh commented 6 years ago

It's possible to support phantom types with a minor, dirty patch to the library:

diff --git a/src/Elm/Derive.hs b/src/Elm/Derive.hs
index 336cdce..1288e98 100644
--- a/src/Elm/Derive.hs
+++ b/src/Elm/Derive.hs
@@ -187,7 +187,7 @@ deriveElmDef opts name =
          DataD _ _ tyVars _ constrs _ ->
              case constrs of
                [] -> fail "Can not derive empty data decls"
-               [RecC _ conFields] -> deriveAlias False opts name tyVars conFields
+--               [RecC _ conFields] -> deriveAlias False opts name tyVars conFields
                _ -> deriveSum opts name tyVars constrs
          NewtypeD [] _ [] Nothing (NormalC _ [(Bang NoSourceUnpackedness NoSourceStrictness, otherTy)]) [] ->
             deriveSynonym opts name [] otherTy
@@ -195,6 +195,8 @@ deriveElmDef opts name =
           if A.unwrapUnaryRecords opts
             then deriveSynonym opts name [] otherTy
             else deriveAlias True opts name [] conFields
+         NewtypeD [] _ tyVars _ constr _ ->
+            deriveSum opts name tyVars [constr]
          TySynD _ vars otherTy ->
              deriveSynonym opts name vars otherTy
          _ -> fail ("Oops, can only derive data and newtype, not this: " ++ show tyCon)

But this creates a data type for records in all cases (so it breaks backwards compatibility a lot and probably conflicts with the author's intention).

With something like this:

newtype Id a = Id { unId :: Int }

data User = User
    { userId   :: Id User
    , userName :: String
    }

exportTypes :: [DefineElm]
exportTypes = [ DefineElm $ Proxy @(Id ())
              , DefineElm $ Proxy @User
              ]

deriveElmDef defaultOptions ''Id
deriveElmDef defaultOptions ''User

I can see the following code for data types is generated:

type Id a =
    Id {unId: Int}

type User  =
    User {userId: (Id User), userName: String}

And it compiles. The change is available here:

bartavelle commented 6 years ago

I am looking a bit into this. It is fairly easy to have the proper type definition, but for serialization code the current infrastructure is a bit lacking. You will get something like this:

type alias PhantomA a = Int
jsonDecPhantomA : Json.Decode.Decoder a -> Json.Decode.Decoder ( PhantomA a )

Where you need to pass a dummy decoder that will not be used. Is that acceptable? If not, it is a bit more work!

arbus commented 6 years ago

The dummy variable in the decoder is not a problem since the generated code is still valid. However, consider the following use case in Haskell:

data Id a = Id { unId :: Int }

data Foo =
  Foo
  { fooId :: Id Foo
  , fooName :: Text
  }

The generated code from your latest commit (887c2) currently doesn't work because Foo is created as a type alias, not a type:

-- This is the currently generated code, which won't compile since type alias
-- cannot have unused type variables.
type alias Id a = Int
-- Assume that we instead change it to a type like below:
type Id a = Id Int

-- The generated code for `Foo` as below fails to compile since
-- it will try and expand the definition of `Foo` infinitely.
type alias Foo  =
   { fooId: (Id Foo)
   , fooName: String
   }

The only way I see around this is to detect when a phantom type is used and change the definition of Foo to a type, something like what @ChShersh's patch does:

type Foo =
    Foo
    { fooId : (Id Foo)
    , fooName : String
    }

bartavelle commented 6 years ago

This will not compile either:

type Id a = Id Int

type alias Foo =
  { fooId : Id Foo
  , fooName : String
  }

Fails with:

This type alias is recursive, forming an infinite type

The only thing that works here is:

type Id a = Id Int

type Foo = Foo
  { fooId : Id Foo
  , fooName : String
  }

Which is kind of bad :/ Adding the Foo on sum types is definitely annoying, and detecting that things are newtypes and doing special things is possible when working on that particular type, but I do not see a way to have that information in the context of generating another type.

There is a "solution" to that problem though, you can use alterations and write whatever you want. But this is a manual process that requires looking at elm-bridge's representation ...

I think the best course of action is to fail as it did before, but with a more helpful message on how to use the alteration thing. You would lose the newtype, but given that it doesn't work properly in Elm anyway, that would be alright I think.

bartavelle commented 6 years ago

Or just fix the newtype so that it produces a proper type declaration, as recursive stuff is perhaps an edge use case?

arbus commented 6 years ago

The type definition above isn't really recursive in the traditional sense; it is just an annotation stating that this is an Id of Foo, which shouldn't be compared with, say, an Id of Bar, even though both are represented as integers.

I know we can use newtypes to get the same effect (albeit with more boilerplate), but the Haskell codebase that we are trying to share types with makes extensive use of this trick to annotate the Id of many different types.
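For readers unfamiliar with the trick, here is a minimal Haskell sketch of the phantom-type annotation described above. It is only an illustration: Bar, sameId, and the field names are assumptions made up for this example, not taken from the codebase under discussion.

```haskell
-- Sketch only: Foo, Bar, and sameId are illustrative names.
module Main (main) where

-- The parameter 'a' is phantom: it never appears on the right-hand side.
newtype Id a = Id { unId :: Int }
  deriving (Eq, Show)

data Foo = Foo { fooId :: Id Foo, fooName :: String }
data Bar = Bar { barId :: Id Bar, barName :: String }

-- Only ids tagged with the same phantom parameter can be compared.
sameId :: Id a -> Id a -> Bool
sameId = (==)

main :: IO ()
main = do
  let foo = Foo (Id 1) "foo"
      bar = Bar (Id 1) "bar"
  -- The literal 'Id 1' is inferred as 'Id Foo' here, so this type checks.
  print (sameId (fooId foo) (Id 1))
  -- The next line is rejected by the type checker (Id Foo vs. Id Bar):
  -- print (sameId (fooId foo) (barId bar))
  -- Comparing the underlying integers requires explicit unwrapping.
  print (unId (fooId foo) == unId (barId bar))
```

Both representations are plain integers at runtime; the tag exists only for the type checker, which is why the Elm side needs a real type (not an alias) to carry it.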

I agree that this is somewhat of a corner case. For now, I am using @ChShersh's fix along with defining all the phantom types manually (there are only a couple) in a separate module and just including them in the generated file.

bartavelle commented 6 years ago

I will try something that works like @ChShersh's fix then. There is no downside to it as the current (released) version crashes and the current (master) version produces invalid code.

bartavelle commented 6 years ago

I did not forget about this, I am just really busy with other things right now.

arbus commented 6 years ago

Appreciate the effort! We are currently using a workaround so it's not a high priority issue for us, although it would be a good feature to have moving forward.

mitchellwrosen commented 6 years ago

I have an extremely WIP library up that can handle this case, perhaps some of you will find it useful.

https://github.com/mitchellwrosen/haskell-to-elm

I just hacked it up on the plane the other day, it's only one module, just a basic recursive function over a Haskell type. I wasn't aware of this library when I wrote it. Here is the test suite of types it seems to handle correctly:

https://github.com/mitchellwrosen/haskell-to-elm/blob/24d125d302cc64095cff786f392160799188b72e/test/Types.hs

bartavelle commented 5 years ago

The problem with this library is that it is currently not well supported, but the point is that you can derive your code with the aeson options you like and it should work. There are other libraries that convert Haskell types to Elm, but none of them does that.

saurabhnanda commented 5 years ago

@mitchellwrosen IIUC you're basically walking down the TH representation of a Haskell data-type and transpiling it to Elm, right? I was thinking of changing the internals of elm-bridge to do something similar. Currently, there seem to be many corner cases that it doesn't handle. For example, I'm not sure why newtype UserId = UserId Int is converted to type alias UserId = Int, and not type UserId = UserId Int.

Is there any reason why elm-bridge is not transpiling the TH representation to Elm?

Related to #40

mitchellwrosen commented 5 years ago

@saurabhnanda Yeah, just making an Elm type from a Haskell one at compile time. With an Elm type ADT in hand, any number of transformations can happen to it at the Haskell value level before rendering, such as field renaming and the type alias <--> type for records/newtypes you mentioned, where both encodings might be desirable under different circumstances.

My perception just before I hopped on the plane & wrote this was that even this basic transformation was not really standardized on, nor exposed by any Elm/Haskell library, so it seemed like the right place to start.

Obviously people want a lot more code gen than just type declarations (json encoding/decoding, servant clients, etc) - which is one possible separation of concerns (i.e. library boundaries) that makes sense to me.
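The idea above (build an Elm type as a Haskell value, transform it, then render it) can be sketched roughly as follows. This is a hypothetical, minimal ADT written for illustration; ElmType, ElmDecl, aliasToCustom, and the render functions are assumed names and are not the API of haskell-to-elm or elm-bridge.

```haskell
module Main (main) where

import Data.List (intercalate)

-- Hypothetical Elm type expressions.
data ElmType
  = ElmCon String [ElmType]        -- applied type constructor, e.g. Id Foo
  | ElmRecord [(String, ElmType)]  -- record type, e.g. { x : Int }
  deriving (Eq, Show)

-- Hypothetical Elm declarations: name, type variables, body.
data ElmDecl
  = ElmAlias String [String] ElmType
  | ElmCustom String [String] [(String, [ElmType])]
  deriving (Eq, Show)

-- Value-level transformation: turn an alias into a single-constructor type.
aliasToCustom :: ElmDecl -> ElmDecl
aliasToCustom (ElmAlias name vars body) = ElmCustom name vars [(name, [body])]
aliasToCustom decl = decl

renderType :: ElmType -> String
renderType (ElmCon name args) = unwords (name : map renderArg args)
renderType (ElmRecord fields) =
  "{ " ++ intercalate ", " [f ++ " : " ++ renderType t | (f, t) <- fields] ++ " }"

-- Parenthesise applied constructors when they appear as arguments.
renderArg :: ElmType -> String
renderArg t@(ElmCon _ (_ : _)) = "(" ++ renderType t ++ ")"
renderArg t = renderType t

renderDecl :: ElmDecl -> String
renderDecl (ElmAlias name vars body) =
  unwords (["type", "alias", name] ++ vars) ++ " = " ++ renderType body
renderDecl (ElmCustom name vars cons) =
  unwords (["type", name] ++ vars) ++ " = "
    ++ intercalate " | " [unwords (c : map renderArg args) | (c, args) <- cons]

-- The Foo record from earlier in this thread, as an alias.
fooAlias :: ElmDecl
fooAlias =
  ElmAlias "Foo" []
    (ElmRecord [("fooId", ElmCon "Id" [ElmCon "Foo" []]), ("fooName", ElmCon "String" [])])

main :: IO ()
main = do
  putStrLn (renderDecl fooAlias)
  -- type alias Foo = { fooId : Id Foo, fooName : String }
  putStrLn (renderDecl (aliasToCustom fooAlias))
  -- type Foo = Foo { fooId : Id Foo, fooName : String }
```

Keeping the alias/type choice as a plain function over the declaration value means either encoding can be selected per type before rendering, which is the flexibility discussed above.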

bartavelle commented 5 years ago

The newtype idiom is supported, in a clumsy way, as described here. The reason it is not done by default is that I liked it better that way :)

Not sure why you are saying it is not transpiling the TH representation to Elm, as this is clearly done though? You mean, why is it using an intermediate representation?

saurabhnanda commented 5 years ago

> The newtype idiom is supported, in a clumsy way, as described here. The reason it is not done by default is that I liked it better that way :)

Ah thanks! Didn't know that.

> Not sure why you are saying it is not transpiling the TH representation to Elm, as this is clearly done though? You mean, why is it using an intermediate representation?

Actually, scratch that question. I'm neck-deep into reading 3 different Elm code-gen libraries and I got confused!