dreixel / generic-deriving

BSD 3-Clause "New" or "Revised" License
44 stars 32 forks source link

Specifying concrete types for makeRep0Inline doesn't work #83

Open lierdakil opened 2 years ago

lierdakil commented 2 years ago

I've made a small project illustrating the issue: https://github.com/lierdakil/generic-deriving-issue

Given some types

data OtherData = OtherData
data SomeData a = SomeData a

I'm trying to make an instance:

instance Generic (SomeData OtherData) where
  type Rep (SomeData OtherData) = $(makeRep0Inline ''SomeData [t|SomeData OtherData|])
  from = undefined
  to = undefined

This, however, fails with a pretty confusing error message:

    • The exact Name ‘a_i2J2’ is not in scope
        Probable cause: you used a unique Template Haskell name (NameU), 
        perhaps via newName, but did not bind it
        If that's it, then -ddump-splices might be useful
    • In the untyped splice:
        $(makeRep0Inline ''SomeData [t| SomeData OtherData |])                

-ddump-splices is indeed useful here:

    makeRep0Inline ''SomeData [t| SomeData OtherData |]
  ======>
    D1 ('MetaData "SomeData" "Test" "main" 'False) (C1 ('MetaCons "SomeData" 'PrefixI 'False) (S1 ('MetaSel 'Nothing 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy) (Rec0 a_i2J2)))

You might notice, the splice is referencing a type variable a_i2J2, which is not on the left-hand side of the type instance definition (because I've used a concrete type, OtherData as an argument).

The expectation that this should work. But it doesn't, likely due to a bug or an oversight. Maybe I'm misunderstanding something. If so, please explain.

Cheers!

RyanGlScott commented 2 years ago

This is indeed a bug. Currently, makeRep0Inline really only works if you're writing an instance of Generic (T a_1 ... a_n), where a_1 ... a_n are all distinct type variables. (This isn't clear at all from reading the Haddocks for makeRep0Inline, unfortunately.) Your instance would likely work if you wrote Generic (SomeData a) instead of Generic (SomeData OtherData). Alternatively, you could use deriving instance Generic (SomeData OtherData).

Is this bug fixable? In some cases, perhaps. It would be annoying to make it work in the general case, however. Consider this example:

newtype S (a :: k) = MkS k (Proxy a)

instance Generic (S Bool) where
  type Rep (S Bool) = $(makeRep0Inline ''S [t| S Bool |])
  ...

Currently, makeRep0Inline generates a Rep instance that looks (approximately) like Rec0 k :*: Rec0 (Proxy a). In order to be correct, however, it would really need to be Rec0 Type :*: Rec0 (Proxy Bool). In other words, we'd somehow need to compute a substitution that maps a to Bool and k to Type. Figuring out that a maps to Bool would be straightforward enough, but figuring out that k maps to Type would be tricky, since the original program does not mention Type anywhere. You'd have to perform type inference in Template Haskell to accomplish this, and doing type inference in TH is notoriously tricky.

lierdakil commented 2 years ago

You'd have to perform type inference in Template Haskell to accomplish this

What if we specify kinds explicitly? E.g.

type Rep (S Bool) = $(makeRep0Inline ''S [t| S (Bool :: Type) |])
RyanGlScott commented 2 years ago

It would still be rather fragile. The reason that makeRep0Inline works today is because it gathers all of the type variables it encounters within [t| ... |] from left-to-right. It's dumb, but very predictable.

When you're allowed to instantiate type variables, however, things become trickier. Consider [t| T (Bool :: Type) (Char :: Type) |], for instance. Which of those types are instantiating type variables? That depends on what the definition of T is. You'd get entirely different substitutions depending on whether you define T as T (a :: Type) (b :: Type), T (a :: j) (b :: Type), T (a :: Type) (b :: k), or T (a :: j) (b :: k). To put it differently: you'd have to remember the positions of each type variable in the original data type declaration and somehow figure out what types appear in the same positions in the [t| ... |] type.

Again, I don't think this would be impossible to do, but it require a great deal of care to make sure the implementation is right.