jaccokrijnen / plutus-cert

0 stars 2 forks source link

Big-step: support datatypes ADTS #12

Open jaccokrijnen opened 1 year ago

jaccokrijnen commented 1 year ago

Current idea (see post below for considerations)

image

Additionally, I think that both language constructs need to be annotated with the ADT they are constructors of, otherwise I don't see how to define the typing rules for them.

jaccokrijnen commented 1 year ago

After some brainstorming with Wouter, we had the following three options

The last option seems the nicest. We substitute in Constr and Match nodes for the introduced variables (or eta-expanded versions) in the big-step rules. And add rules like:

t [ (\x y z -> Constr 1 [x,y,z]) / C1] ... [.../C_n] [(\e b1 ... bn -> Match e [b1 ... bn] /T_match]
-----------------------------------------------------------
let data T = C_1 T_1_1 T_1_2 T_1_3 | ... | C_n with T_match in t

t => Constr i vs     bs_i vs => v
---------------------------------------- [Match-eval]
Match t [b_1 ... b_n] => v

t_i => v_i   (for all i)
--------------------------------------- [Constr-value]
Constr i [t_1 ... t_n] => Constr i [v_1 ... v_n]
jaccokrijnen commented 1 year ago

Phil recommended to look at existing literature and reuse any of those. This will probably look like an environment-based semantics for the ADTs. Michael was also in favor of environment based, even if PIR is to include sums of products. The compiler may choose how to compile ADTs (either scott, or sums-of-products), and just picking either is not really appropriate for the semantics at PIR's level.

A constructor as value only has a meaning under a datatype environment (a bit like a closure).

What will the relation look like?

ADTS |- t => (ADTS, v)

where the first ADTS are in scope and the second are part of the closure?

jaccokrijnen commented 1 year ago

image

jaccokrijnen commented 1 year ago

Compatibility lemma of let data

A challenging bit is to prove the compatibility lemma for Data binds. In this case, we need to prove that substituting in encoding $e$ is semantically the same as substituting in another encoding $e'$. This comes down to showing that all corresponding encodings of constructors and match-functions logically approximate each other.

Parametricity

The type introduced by a $let data$ is represented as a type variable, which we could see as a polymorphic type + its operations (constructors and match-function). An encodings instantiates the datatype to a concrete representation and may of course use different types to do that (e.g. sum-of-products or scott-encoding). In our case, by definition of a valid encoding, we know that the operational semantics should behave identically.

Our logical relation is equipped with a notion of parametricity, because the substitution of type variables contains two syntactic substitutions (i.e. the possibly different instantiations of the type variable), as well as a semantic substitution, relating two types.

TODO [ ] Go through Amal's lectures/notes to see how a parametricity proof works in the context of LRs [ ] Implement proof for valid encoding

jaccokrijnen commented 1 year ago

For now: fix encoding in logical relation

We can fix the encoding to be equal between pre- and post semantics.

Alternative to parametricity proof?

Wouter and I discussed the possibility of proving it in a different way, if we can prove encoding independence:

$$ t \Downarrowe v \Rightarrow t \Downarrow{e'} v' \wedge \mathcal{RV}[[ \tau ]]_\rho(k, v, v') \hspace{10em} \text{(Encoding independence)} $$

And use a compatibility lemma that has fixed encodings.

During the discussion we were not really clear on the relation that should hold between v and v', but I think now that it should just be the logical relation $\mathcal{RV}[[ \tau ]]_\rho(k, v, v')$, since this is part of the definition of the approximation.

image