Open jaccokrijnen opened 2 years ago
After some brainstorming with Wouter, we had the following three options
The last option seems the nicest. We substitute in Constr
and Match
nodes for the introduced variables (or eta-expanded versions) in the big-step rules. And add rules like:
t [ (\x y z -> Constr 1 [x,y,z]) / C1] ... [.../C_n] [(\e b1 ... bn -> Match e [b1 ... bn] /T_match]
-----------------------------------------------------------
let data T = C_1 T_1_1 T_1_2 T_1_3 | ... | C_n with T_match in t
t => Constr i vs bs_i vs => v
---------------------------------------- [Match-eval]
Match t [b_1 ... b_n] => v
t_i => v_i (for all i)
--------------------------------------- [Constr-value]
Constr i [t_1 ... t_n] => Constr i [v_1 ... v_n]
Phil recommended to look at existing literature and reuse any of those. This will probably look like an environment-based semantics for the ADTs. Michael was also in favor of environment based, even if PIR is to include sums of products. The compiler may choose how to compile ADTs (either scott, or sums-of-products), and just picking either is not really appropriate for the semantics at PIR's level.
A constructor as value only has a meaning under a datatype environment (a bit like a closure).
What will the relation look like?
ADTS |- t => (ADTS, v)
where the first ADTS are in scope and the second are part of the closure?
let data
A challenging bit is to prove the compatibility lemma for Data binds. In this case, we need to prove that substituting in encoding $e$ is semantically the same as substituting in another encoding $e'$. This comes down to showing that all corresponding encodings of constructors and match-functions logically approximate each other.
The type introduced by a $let data$ is represented as a type variable, which we could see as a polymorphic type + its operations (constructors and match-function). An encodings instantiates the datatype to a concrete representation and may of course use different types to do that (e.g. sum-of-products or scott-encoding). In our case, by definition of a valid encoding, we know that the operational semantics should behave identically.
Our logical relation is equipped with a notion of parametricity, because the substitution of type variables contains two syntactic substitutions (i.e. the possibly different instantiations of the type variable), as well as a semantic substitution, relating two types.
TODO
Discussion on parametricity for proving adequacy of a different encoding
Lecture by Leroy on parametricity
[TAPL 2, Chapter 7]
We can fix the encoding to be equal between pre- and post semantics.
Wouter and I discussed the possibility of proving it in a different way, if we can prove encoding independence:
$$ t \Downarrowe v \Rightarrow t \Downarrow{e'} v' \wedge \mathcal{RV}[[ \tau ]]_\rho(k, v, v') \hspace{10em} \text{(Encoding independence)} $$
And use a compatibility lemma that has fixed encodings.
During the discussion we were not really clear on the relation that should hold between v and v', but I think now that it should just be the logical relation $\mathcal{RV}[[ \tau ]]_\rho(k, v, v')$, since this is part of the definition of the approximation.
Current idea (see post below for considerations)
Additionally, I think that both language constructs need to be annotated with the ADT they are constructors of, otherwise I don't see how to define the typing rules for them.