uwplse / pumpkin-pi

An extension to PUMPKIN PATCH with support for proof repair across type equivalences.
MIT License
49 stars 9 forks source link

Lifting large unpacked constants is very slow #44

Open tlringer opened 5 years ago

tlringer commented 5 years ago

There should be some optimization for this

tlringer commented 5 years ago

I guess this is not really a bug, so much as a consequence that DEVOID's cache doesn't understand unpacked constants, and so when it sees projT1 x in a body, and x is the lifted version of y, but only the unpacked version of x defined by z := projT2 x has been lifted again, it doesn't understand that x has already been lifted and so it does all of that work from scratch.

Consequentially, lift, lift, lift, unpack, unpack, unpack is much faster for large constants than lift, unpack, lift, unpack, lift, unpack, since caching is much better in the first case.

tlringer commented 5 years ago

So, e.g.:

Unpack __bst20' as __bst20.
Unpack __bst40' as __bst40.
Lift __bst _bst in __bst20 as _bst20'.
Lift __bst _bst in __bst40 as _bst40'.
Print __bst40.
(* projT2 __bst40' *)
Print __bst40'.
(* existT _ 
    (projT1 __bst20')
    (__Branch (projT1 __bst20') (projT1 __bst20') 2
              (projT2 __bst20') (projT2 __bst20')) *)
Print __bst20.
(* __bst20 = projT2 __bst20' *)

When DEVOID gets to lifting __bst40, DEVOID's cache has no clue that it has already lifted __bst20' when it lifted __bst20. That information is stored in the local cache, but not in the global cache, since we don't want to store every single constant that has been lifted already.

I'm not sure what a good way to improve caching for this is, but here are some bad ways:

To use the unpacked cache regardless of how we construct it, when we go to lift something that references __bst20', the unpacked cache will understand that __bst20 is the unpacked version of __bst20'. Thus, lifting can substitute in existT _ (projT1 __bst20') __bst20 for __bst20'. Then supposed we are lifting __bst40:

 existT _ 
    (projT1 __bst20')
    (__Branch (projT1 __bst20') (projT1 __bst20') 2
                      (projT2 __bst20') (projT2 __bst20'))`

We can substitute __bst20 for projT2 __bst20':

 existT _ 
    (projT1 __bst20')
    (__Branch (projT1 __bst20') (projT1 __bst20') 2 __bst20 __bst20)

And then we can lift the cached versions of those instead.

Regardless, I think this an inconvenience but not too bad, so what I am going to do is add a bit to the README that notes we should unpack after lifting fully, and then plan to address this in the code after.

The large branch has code that is a WIP on this issue, including test cases both ways.