google-research / dex-lang

Research language for array processing in the Haskell/ML family
BSD 3-Clause "New" or "Revised" License
1.58k stars 106 forks source link

Typeclass instance methods cannot reference other methods in the same instance #323

Open dan-zheng opened 3 years ago

dan-zheng commented 3 years ago

The following Dex code doesn't compile:

interface Show a:Type where
  show : a -> String
  showList : List a -> String

-- Alternative type:
-- def showListDefault (show: Show a) ?=> (list: List a) : String =
def showListDefault (show: a -> String) (list: List a) : String =
  (AsList _ xs) = list
  lbracket = AsList _ ['[']
  rbracket = AsList _ [']']
  strings = map show xs
  result = lbracket <> (reduce nil (<>) strings) <> rbracket
  result

instance int32Show : Show Int32 where
  show = \x: Int32.
    (n, ptr) = %ffi showInt32 (Int & CharPtr) x
    AsList n $ for i:(Fin n).
      MkChar $ %ptrLoad (%ptrOffset ptr (ordinal i))

  -- Error: showList cannot reference show (defined in the same instance), which is unfortunate.
  showList = showListDefault show
> Type error:Couldn't synthesize a class dictionary for: (Show Int32)
>
>   showList = showListDefault show
>                              ^^^^

The same code compiles in Haskell (via GHC):

class Show' a where
  show' :: a -> String
  showList :: [a] -> String

instance Show' Int where
  show' _ = "" -- dummy implementation

  -- showList references show'. This is legal in Haskell and would be nice in Dex.
  showList list = foldl (++) "[" strings ++ "]" where
    strings = map show' list

Maybe we can learn from how the Haskell program above is represented in GHC's Core language.

apaszke commented 3 years ago

This is because we implement interfaces using records, and those cannot have any dependencies between their fields. If the methods in a typeclass form a DAG, then we could theoretically pack them in an ADT, but we have no way of handling mutually-recursive definitions (Dex does have those).

dan-zheng commented 3 years ago

Thanks for the explanation Adam!

Some follow-up questions:

duvenaud commented 2 years ago

I have a use case for this feature. I'm trying to generalize some numerics code to general manifolds, so I want to introduce a Basis typeclass that enumerates elements of a basis of a vector space. Something like:

interface [VSpace a] Basis a
  ix : Type
  basis : ix=>a

I can sort of get around this by using a List instead, but then all the types are needlessly generic and there's lots of wrapping and unwrapping to and from lists:

interface [VSpace a] Basis a
  basis : List a

instance Basis Float
  basis = AsList 1 [1.0]

instance {n} Basis (n=>Float)
  basis =
    sizen = size n
    btab = for i.
      yieldState zero \d.
        d!i := 1.0
    AsList sizen $ unsafeCastTable (Fin sizen) btab

One nice thing that this would allow is to write jacobian without specializing it to tables of Floats:

def jacobian {a b} [Basis a, VSpace a, VSpace b] (f:a->b) (x:a) : List b =
  j = jvp f x
  (AsList _ basisTab) = basis
  AsList _ for b. j basisTab.b

:p jacobian sum [1.0, 1.1]
(AsList 2 [1., 1.])

But I think I'd rather have jacobian return a table than a List. And in my actual use case, the number of dimensions in the basis shows up all over the place in the types. However this isn't a blocking issue for me - for now I can just use tables of floats everywhere.

apaszke commented 2 years ago

Yeah it's something I actually have already started working on, but then had to move on to other things... A simple workaround for fields that reference each other is to split the class into multiple classes:

interface BasisIx a
  basisIx a : Type

interface [BasisIx a] Basis a
  basis : (basisIx a)=>a

That should unblock you from the issue you've mentioned here, but I'm afraid that this approach will run into many limitations of our type inference soon once you start using it. But then we'll know more about what we need to fix!

duvenaud commented 2 years ago

Ah, thanks for unblocking me! Yes, I have already hit one error message along the lines of "this would require more sophisticated dictionary / type inference".

duvenaud commented 2 years ago

Heh, I hit the limits of type inference sooner than expected! Your example doesn't work:

interface BasisIx a
  basisIx a : Type

interface [BasisIx a] Basis a
  basis : (basisIx a)=>a

gives

Type error:Couldn't synthesize a class dictionary for: (BasisIx a)

  basis : (basisIx a)=>a
           ^^^^^^^^^^^^^

But again don't sweat it, this isn't a blocking issue for anything.

dougalm commented 1 year ago

A simple version of this wouldn't be too hard to implement. I'm particularly thinking about the case where you only refer to previous methods within the same class (according to the ordering given by the interface definition, which might be different from the order in the instance definitions). That way we don't have to worry about toposorts and cycles.

Currently, in inferUDecl, we first synthesize the dictionaries required in the body of an instance, and if that succeeds we then emit the instance. So the instance itself isn't available during dictionary synthesis:

  def' <- synthInstanceDef def
  instanceName <- emitInstanceDef def'

Instead, we could emit the partially-formed instance first (i.e. the instance without any methods), and then update the instance with its methods one-by-one, allowing earlier methods to be used in later methods. We'd have to take some care to give good error messages in the case that you try to access a method that violates the top-to-bottom topological dependencies. If we're not careful, it'll be possible to write recursive programs that will make the compiler spin its wheels forever.