ThoughtWorksInc / DeepDarkFantasy

A Programming Language for Deep Learning
Apache License 2.0
462 stars 47 forks source link

Add Partial Evaluation #194

Closed izgzhen closed 7 years ago

izgzhen commented 7 years ago

https://github.com/ThoughtWorksInc/DeepDarkFantasy/issues/106

izgzhen commented 7 years ago

There are quite a few things worth discussing --

  1. Dropping liftEnv invalidate the last definition in DiffInf. I have no idea either about why liftEnv exists (it has a signature that I can't quite make sense of), or what is the exact mathematical implication of infinite diff.
  2. I can't make PE work with the original Double, which has a wrapped function signature, instead of a explicit HOAS-like one. I guess probably I am doing the wrong thing, and I am still thinking why this happens.
izgzhen commented 7 years ago

BTW, @MarisaKirisame take your time, no need to rush to comment about this :)

MarisaKirisame commented 7 years ago

0: DiffInf basically let you take Diff arbitrary amount of time (using laziness, it generate infinite amount of Diff). It is the only reason for liftEnv (Sorry, I forgot). 1: something seems strange. Using lambda/application, the two definition can convert to each other.

MarisaKirisame commented 7 years ago

2: So what can be done is, use finally tagless, remove liftEnv from DBI, into it's own typeclass. This will solve the incompatibility problem, but I suspect PE on liftEnv can be done. I will look into that.

izgzhen commented 7 years ago

Using lambda/application, the two definition can convert to each other.

It should ... but doesn't work with PE nicely. I will add more info on this tonight.

izgzhen commented 7 years ago

use finally tagless, remove liftEnv from DBI

You mentioned laziness -- will that be the deep reason why PE can't be derived?

izgzhen commented 7 years ago

And if we do this instead:

  doublePlus = lam2 f
    where -- f (Unk x) (Unk y) = Unk $ app2 doublePlus x y
          f (Static d1 _) (Static d2 _) = double (d1 + d2)
          -- f e1 e2 = Open (\h -> app2 doublePlus (app_open e1 h) (app_open e2 h))

The type checker will complain about the overlapping instances ...

MarisaKirisame commented 7 years ago

You mentioned laziness -- will that be the deep reason why PE can't be derived?

I dont know.

can you give the full err msg for overlapping?

izgzhen commented 7 years ago

Consider the derivation like below:

instance Double r => Double (P r) where
  double x | trace (M.unwords ["PE: D.double", show x]) M.True =
               Static x (double x)
  doublePlus = lam2 f
    where
      -- f :: forall h.
      --      (forall k. NT (P r) (M.Double, h) k => P r k M.Double) ->
      --      (forall k. NT (P r) (M.Double, (M.Double, h)) k => P r k M.Double) ->
      --      P r (M.Double, (M.Double, h)) M.Double
      f (Static d1 _ {- :: P r (M.Double, h) M.Double- -} )
        (Static d2 _ {- :: P r (M.Double, (M.Double, h)) M.Double -}) = double (d1 + d2)
      -- f (Unk x) (Unk y) = Unk $ app2 doublePlus x y
      -- f e1 e2 = Open (\h -> app2 doublePlus (app_open e1 h) (app_open e2 h))

here f will arise a type check error:

error:
    • Overlapping instances for NT (P r) (M.Double, h) t0
      Matching instances:
        instance [overlappable] [safe] forall k k1 (repr :: k1
                                                            -> k -> *) (l :: k1) (r :: k1).
                                       NTS repr l r =>
                                       NT repr l r
          -- Defined at /Users/zz/Repos/DeepDarkFantasy/DDF/DBI.hs:96:31
        instance [overlapping] [safe] forall k k1 (repr :: k1
                                                           -> k -> *) (x :: k1).
                                      NT repr x x
          -- Defined at /Users/zz/Repos/DeepDarkFantasy/DDF/DBI.hs:99:30
      (The choice depends on the instantiation of ‘r, h, t0’
       To pick the first instance above, use IncoherentInstances
       when compiling the other instance declarations)
    • In the first argument of ‘lam2’, namely ‘f’
      In the expression: lam2 f
      In an equation for ‘doublePlus’:
          doublePlus
            = lam2 f
            where
                f (Static d1 _) (Static d2 _) = double (d1 + d2)
MarisaKirisame commented 7 years ago

This happen because you are not using the environment. lam, lam2, lam3... automatically lift the variable with S, into the correct scope. Try to lift it manually, using De Bruijn Index.

izgzhen commented 7 years ago

This happen because you are not using the environment.

This is the tricky part -- sometimes I use the environment, sometimes I don't (which is the case where I can do PE). I need that information exposed to me somehow.

Another thought is, can I argue that the doublePlus's signature is weaker that what is needed to do PE? It is no different from any other function in that DSL, but it should be different.

I think I am stuck now :(

MarisaKirisame commented 7 years ago

This is the tricky part -- sometimes I use the environment, sometimes I don't (which is the case where I can do PE). I need that information exposed to me somehow.

You can try using only De Bruijn Index. No matter you're using the env or not, you're inferring everything by hand.

You might try to change it to see if something else work. I will be looking at the code, boarding a plane now.

dramforever commented 7 years ago

I wrote the NT stuffs here. I think we need to talk...

Only talking about doublePlus.

I believe there is a misunderstanding. The argument lam2 passes to f is always just a variable. The polymorphic type only makes it choose from z, s z, s (s z) etc. depending on the surrounding type. Case-spliting on them defeats the entire purpose of it. I suggest you construct this function by using abs twice, then directly peeking from the environment.

Or why not just doublePlus = Static (+) doublePlus?

izgzhen commented 7 years ago

Or why not just doublePlus = Static (+) doublePlus?

Because it doesn't really do partial evaluation. However, it does specialization of d1 and d2, which the following another definition doesn't.

You are right that the essence is to do case split over something, but what is something? What does "directly peeking from the environment" mean?

I can go this way:

instance Double r => Double (P r) where
  double x | trace (M.unwords ["PE: D.double", show x]) M.True =
               Static x (double x)
  doublePlus = hoas $ (\x -> hoas $ (\y -> f x y))
    where
      f (Static d1 _) (Static d2 _) = double (d1 + d2)

then

> runShow t2 [] 1

which is not working in that runtime exception saying "Non-exhaustive patterns in function f", which means both d1 and d2 are not specialized properly.

dramforever commented 7 years ago

Of course. hoas just passes z to your function, which has two problems:

  1. z is just an Open according to this, so it won't match Static
  2. You forgot to use s x.
dramforever commented 7 years ago

Your function needs to give a result directly, without knowing what the arguments will be. From what I know, partial evaluation needs to happen in app, not directly in doublePlus. So the result needs to tell app 'I'm doublePlus' somehow.

dramforever commented 7 years ago

hoas and lam aren't what you need here. You already know what z and s z are, so don't get the value just to case on it.

izgzhen commented 7 years ago

So the result needs to tell app 'I'm doublePlus' somehow.

Exactly. If the interface is just doublePlus' :: r h Double -> r h Double -> r h Double, then we won't have that much hassle.

Thanks a lot, I will see what I can do here.

izgzhen commented 7 years ago

Fixed, finally! It is a great learning experience. I will make it more elaborated later.

dramforever commented 7 years ago

So you figured it out! Congratulations!

izgzhen commented 7 years ago

@MarisaKirisame BTW, what is your current idea on handing liftEnv? I propose to make it a separate type class, though that might cause OrphanInstances or something.

Also, I failed to do liftEnv derivation for P class. I think I would need some help here.

MarisaKirisame commented 7 years ago

Back from lambdaconf. Make it a separate typeclass is good. Will look into it more tomorrow (catching up with hw)

dramforever commented 7 years ago

There will be problems because only types of the form (a1, (a2, ..., (an, ()) ... )) are environments. Making it more... valid requires using a type-level list, which is a big (but hopefully mechanical) refactoring, which may or may not be useful. I'm going to guess it's not useful, because, well, there's no need if you keep the 'tail' of the environment type arg polymorphic everywhere you'll never need this.

MarisaKirisame commented 7 years ago

I can rewrite it to type list if needed.

izgzhen commented 7 years ago

I've fixed all missing cases in binaryPE, and also updated the simple tests.

We have two problems left though:

  1. Do we need to automate the tests? Here are two things to test: where it does the real partial evaluation or not (i.e. is the amount computation less than before), and whether it does the PE in correct way.
  2. liftEnv: The question is, can we implement it with current definition? I tried a few times, all failed, so I kinda agree with what @dramforever said. However, we need to a more formal description on why current liftEnv definition is not compatible with P, and whether a type-list-based one would.
MarisaKirisame commented 7 years ago

1: Yes, it should be automated by travis 2: Can you show me some attempt? BTW, I got confused: I found P to be very ad hoc - why cant it be: Bottom, or a unknown value, or a one-step unroll? For example, P (Maybe a) will be Bottom, or unknown, or Maybe (P a) P (a -> b) will be bottom/unknown/ P a -> P b we can then match on P a to do all sort of stuff (P h x is represent by a hlist of P on H to P x) liftEnv will just ignore the given input hlist

izgzhen commented 7 years ago

I found P to be very ad hoc...

Do you mean something like below?

data P r h a where
  Bot :: P r h a
  Unk :: r h a -> P r h a
  Unroll :: K (P r h a) -> P r h (K a)

First, how should we represent "one-step unroll"? How will that interacts with De Brujin index? As far as I know, most of the complications in the Oleg's tutorial code come from de brujing index -- i.e. EnvT-based type environment transformation.

Can you give an illustration of your proposed PE process for a simple example, e.g. (\0 -> \1 -> 1 + 0)?

izgzhen commented 7 years ago

I've tried to inject trace like unsafe IO to track the times of evaluation performed by GHC runtime, but it seemed revealing some interesting problem:

https://gist.github.com/izgzhen/ea3bdbb571308dcb32295c7e215b39e1

If you run main, you will see

Progress: 1/2TestPE: DDF/Show.hs:(32,19)-(33,73): Non-exhaustive patterns in lambda

Using non-HOAS form, like abs (app2 doublePlus z t1) will be fine

izgzhen commented 7 years ago

Test added!

MarisaKirisame commented 7 years ago

Good... Let the test flow through you

MarisaKirisame commented 7 years ago

One step unroll will be represented by type family. GADT is not needed - data P r x = Bottom | Unknown (r () x) | Known (r () x) (Unroll r x) type instance Unroll r (x, y) = (P r x, P r y) type instance Unroll r Double = Double

MarisaKirisame commented 7 years ago

Interact with Enviroment - newtype PE r h x = PE { runPE :: mapP r h -> P r x } type instance mapP r () = () type instance mapP r (x, y) = (P r x, mapP r y) De Bruijn Index is simply zro, fst...

MarisaKirisame commented 7 years ago

app (app plus (lit 1)) (lit 2) plus :: PE r h (Double -> Double -> Double) plus = PE (\h -> Known plus plus1) plus1 :: P r Double -> P r (Double -> Double) plus1 Bottom = Bottom plus1 (Unknown x) = Unknown (app plus x) plus1 (Known x) = Known (app plus (lit x)) (plus2 x) plus2 :: Double -> P r Double -> P r Double plus2 x Bottom = Bottom plus2 x (Unknown y) = Unknown (app (app plus (lit x)) y) plus2 x (Known y) = Known (lit (x + y)) (x + y)

MarisaKirisame commented 7 years ago

z :: PE r (x, h) x z = PE zro s :: PE r h x -> PE r (s, h) x s (PE hx) = PE (hx . fst) abs :: PE r (a, h) b -> PE r h (a -> b) abs (PE ahb) = PE (\h -> Known (\a -> ahb (a, h)) liftEnv :: PE r () x -> PE r h x liftEnv (PE x) = PE (const x)

MarisaKirisame commented 7 years ago

Some comment on the above:

0: Bottom look ugly since most code do not have bottom. Might be factored out By adding rec-self into mapP, Unroll, etc...

1: Unroll r [x] should not be [P r x] -> too much assumption it should be Maybe (P r (x, [x])) -> Unroll on the F-algebra form

2: The fact that oleg dont do this might suggest there is some flaw. It is unlikely that oleg overlook such approach and employ a more difficult one - he is very good with haskell hackery. I shall contact him about this once we had a version set up.

3: Known should also contain a copy of the object language term, just like P in Oleg's construct, to convert back to code.

izgzhen commented 7 years ago

Another example is (\x -> x + 1) 2, or verbosely, (\x -> app (app plus x) 2) 1.

In your scheme, my derivation would be:

app plus x => Unknown (app plus x) since x is unknown.

app (Unknown (app plus x)) 2) => Unknown (app (app plus x) 2)

abs (app (Unknown (app plus x)) 2) => PE (\h -> Unroll (\x -> app (Unknown (app plus x)) 2))

app (abs (app (Unknown (app plus x)) 2)) 1 => PE (\h -> app (Unknown (app plus 1)) 2

Hopefully there is no silly mistake since I am in a hurry to go out .... maybe I didn't really understand your scheme, but it looks like this doesn't do any PE?

izgzhen commented 7 years ago

But anyway, this would be an interesting thing to investigate into

MarisaKirisame commented 7 years ago

I had updated the plus example. getCode :: Lang r => P r x -> r () x getCode Bottom = bottom getCode (Unknown x) = x getCode (Known x ) = x app :: Lang r => PE r h (a -> b) -> PE r h a -> PE r h b app (PE ab) (PE a) = PE (\h -> appP (ab h) (a h)) appP :: Lang r => P r (a -> b) -> P r a -> P r b appP Bottom = Bottom appP (Unknown ab) a = Unknown (app ab (getCode a)) appP (Known ab) a = ab a The thing is, IR dont have to deal with HOAS - DBI is enough. Let's desugar it: (\x -> app (app plus x) 2) 1 app (\x -> app (app plus x) (lit 2)) (lit 1) app (abs (app (app plus z) (lit 2))) (lit 1) recall that z = PE zro, plus = PE (\h -> Known plus plus1) app plus z = app (PE (\h -> Known plus plus1)) (PE zro) = PE ((a, h) -> appP (Known plus plus1) a) = PE ((a, h) -> Known (app plus (GetCode a)) (plus1 a)) lit 2 = PE (\h -> (Known (lit 2) 2)) app (app plus z) 2 = app (PE ((a, h) -> Known (app plus (GetCode a)) (plus1 a))) (PE (\h -> (Known (lit 2) 2))) = PE ((a, h) -> appP (Known (app plus (GetCode a)) (plus1 a)) (Known (lit 2) 2)) = by plus1 (Known x) = Known (app plus (lit x)) (plus2 x) = PE ((a, h) -> Known (app plus (lit 2)) (plus2 2)) abs (app (app plus z) 2) abs (PE ((a, h) -> Known (app plus (lit 2)) (plus2 2))) = by abs (PE ahb) = PE (\h -> Known (\a -> ahb (a, h)) PE (\h -> Known (\a -> Known (app plus (lit 2)) (plus2 2))) damn, something is wrong, I will think about it

MarisaKirisame commented 7 years ago

got it. make PE into a term without environment by unliftEnv (SKI Combinator form) liftEnv to make in generic after PE is done

izgzhen commented 7 years ago

appP (Known _ ab) a = ab a

This rule is weird.

Known (app plus (GetCode a)) (plus1 a)

Known looks similar to Static in Oleg's formulation to me -- but for sake of being static, it should not contain any a. StaFun is another thing though.

PE ((a, h) -> appP (Known (app plus (GetCode a)) (plus1 a)) (Known (lit 2) 2)) = by plus1 (Known _ x) = Known (app plus (lit x)) (plus2 x) = PE ((a, h) -> Known (app plus (lit 2)) (plus2 2))

You somehow missed a I guess

izgzhen commented 7 years ago

@MarisaKirisame @dramforever https://gist.github.com/izgzhen/0fd345a5cbe24967150844c8f8f0675a comments welcomed!

MarisaKirisame commented 7 years ago

Cool! am working on formally verifying DDF right now, feel free to make more pr! :) 文章弄上知乎怎么样(

izgzhen commented 7 years ago

文章弄上知乎怎么样(

review 下确定基本没大问题我会发到我的带 mathjax 的博客上,到时候转发没问题啊~(虽然这篇比较 technical。。。