Blaisorblade opened this issue 10 years ago
@Toxaris added:
> Yes. But the design is not obvious. What happens if a file imports a file with a different `#lang`? Maybe you can open an issue and we can discuss it on GitHub?
Related to this: Is there a subtyping relationship between PTS instances?
One obvious solution would be to only allow imports of files which use the same PTS instance.
I guess there is also a simple "subtyping" relationship, which is basically a subset relationship of the sorts, axioms, and relations (plus renaming).
Sorry, mistook the "close" for a "cancel" button. Maybe that aspect of the GitHub issues interface is just wrong; I saw it happen to others.
I guess the "subtyping" proposal would allow modules written in a powerful type system to import modules written in a less powerful type system, but not vice versa.
Or could one also allow the other direction somehow?
Or does all this not work anyway, e.g., because bad things can happen if one passes a higher-order function (similar to the complications of blame assignment with higher-order contracts)?
> I guess there is also a simple "subtyping" relationship, which is basically a subset relationship of the sorts, axioms, and relations (plus renaming).
> I guess the "subtyping" proposal would allow modules written in a powerful type system to import modules written in a less powerful type system, but not vice versa.
That's also my guess.
If `|-_PTS1 t : T` and `PTS1 <= PTS2`, then `|-_PTS2 t : T` — that is, `PTS1 <= PTS2` means that the typing judgement in PTS2 is a conservative extension of the one in PTS1.
Proof: by structural induction on the typing derivation of `|-_PTS1 t : T`. Let `PTS1 = (S1, A1, R1)` and `PTS2 = (S2, A2, R2)` as usual. The only interesting cases are for axioms and Pi types, and the proof is similar: the rule checks that something belongs to `A1` (resp. `R1`), but then it belongs to `A2` (resp. `R2`), so `|-_PTS2 t : T`. All contained PTS1-derivations are PTS2-derivations by the inductive hypothesis. QED.
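For concreteness, here is how the subset criterion could look in Haskell — a minimal sketch with hypothetical types, not the project's actual representation:

```haskell
import qualified Data.Set as Set

type Sort = String

data PTS = PTS
  { sorts  :: Set.Set Sort                -- S
  , axioms :: Set.Set (Sort, Sort)        -- A: (c, s) encodes c : s
  , rules  :: Set.Set (Sort, Sort, Sort)  -- R: relations for Pi types
  }

-- PTS1 <= PTS2 iff every component of PTS1 is a subset of PTS2's.
subPTS :: PTS -> PTS -> Bool
subPTS p1 p2 =
  sorts p1 `Set.isSubsetOf` sorts p2
    && axioms p1 `Set.isSubsetOf` axioms p2
    && rules p1 `Set.isSubsetOf` rules p2
```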
The analogous preservation result should be even more obvious for evaluation, since evaluation is untyped.
If this holds, importing code in smaller instances is clearly sound — you can use the weaker system to typecheck them and use the results (types and values) in the bigger system.
One could also consider imports from bigger instances, with fewer guarantees.
Given my experience with advanced Scala, I propose to be conservative and implement only something which is obviously correct.
The most conservative proposal would be to allow importing modules that have either no `#lang` or the same one as the file including them — this should also be easiest to implement, since one doesn't need to switch instance midway (though even that is probably easy-ish).
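A minimal sketch of that conservative check, with illustrative names (not the actual implementation):

```haskell
-- Accept an import only if the imported file declares no language,
-- or the same one as the importer.
newtype Lang = Lang String deriving (Eq, Show)

okToImport :: Lang -> Maybe Lang -> Bool
okToImport importer declared = maybe True (== importer) declared
```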
One problem with all of the above is that in PTS we often add at some level a constant called `notExpressible`, with a rule `notExpressible : notExpressible`, and that invalidates the results I mentioned above: in particular, `eval_PTS2 t != eval_PTS1 t` even if `|-_PTS1 t : T` and `PTS1 <= PTS2`, because some occurrences of `notExpressible` must be replaced by another constant.
Ultimately, this is because our typed terms don't strictly encode PTS typing derivations: they try to ensure that each term has a type, therefore always producing cyclic derivations for constants. The original Axiom rule, instead, does not require sorts to be well-typed themselves.
We could change this: we could define `TypedTerm = ... | TypedConst C C` and take constants out of `TermStructure`. To reduce the impact on the codebase, we could still have accessors return `notExpressible` as long as we don't store it in the tree, and use GHC pattern synonyms to ease the transition (we could also use views and then update matchers with `sed`).
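A hypothetical sketch of how the `TypedConst` change and the pattern-synonym transition could look (illustrative names, not the codebase's):

```haskell
{-# LANGUAGE PatternSynonyms #-}

data Const = Star | Box | NotExpressible
  deriving (Eq, Show)

-- Constants move out of TermStructure into a constructor that also
-- records the constant typing them.
data TypedTerm
  = TypedConst Const Const          -- the constant and the constant typing it
  | TypedApp TypedTerm TypedTerm    -- further constructors elided

-- A unidirectional pattern synonym lets old matchers, which ignored
-- the type of a constant, keep compiling during the transition:
pattern OldConst :: Const -> TypedTerm
pattern OldConst c <- TypedConst c _
```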
> The most conservative proposal would be to allow importing modules that have either no `#lang` or the same one as the file including them.
I'm considering doing this. The subtyping stuff sounds cool, too, but maybe too complicated.
A crazy idea about PTS interoperability:
If we consider a setting with subtyping (and renaming) as described above, this would enable some potentially cool things:
For instance, one could write a library in STLC and then import it into another module which uses the STLC on the type level.
If the language of a module supports a superset of STLC on multiple universes, one could even import the STLC module multiple times (one would have to specify in the import on which level it is supposed to be imported).
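One way to read "import at level k" is as a renaming on sorts; a minimal sketch, assuming a hypothetical representation with numbered universes:

```haskell
-- Hypothetical representation: sort *_i is just its level i.
type Level = Int

-- Importing a module "at level k" would apply this shift to all
-- sorts in its instance (S, A, R) and to the sorts in its terms,
-- so the same module can be imported once per level.
shiftBy :: Int -> Level -> Level
shiftBy k i = i + k
```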
I'm not quite sure how this idea relates to universe polymorphism and cumulativity.
Another thought about the idea of a reverse subtyping relation: let's say we have an STLC module importing a System F module. Would it be sound to allow the STLC module to use terms from the System F module that happen to have an STLC type? I wouldn't expect any type soundness problems, since you can typecheck the composed program in the more powerful type system, but maybe adding additional typing rules can invalidate other invariants, such as parametricity?
Potential issues with importing from bigger PTS (that is, PTS with more sorts, rules or axioms): the composed system might not be "full and functional" (which our type checker assumes), and the bigger PTS might not be strongly normalizing even when the smaller one is.
What do you mean by "not functional"?
See this comment about the restriction of our type checker to "full and functional" PTS. I think you put it there when you wrote the PTS type checker back in the summer of 2008.
Ah, now I remember vaguely. Functional PTS was something that implied uniqueness of types. Full PTS was something about making the typing rule for functions syntax-directed. I agree that PTS composition does not preserve these properties. I also agree that it is probably not problematic because we would type-check each file in "its" type system only.
I think your example with strong normalization is useful, though.
I wonder whether it also works the other way around: Is it possible that a smaller PTS is not strongly normalizing but a bigger one is? The answer is a clear "no" if PTS1 < PTS2 means that PTS1 accepts only a subset of the programs. Is PTS extension (adding axioms, relations etc.) monotonic with respect to the set of well-typed programs?
> I wonder whether it also works the other way around: Is it possible that a smaller PTS is not strongly normalizing but a bigger one is? The answer is a clear "no" if PTS1 < PTS2 means that PTS1 accepts only a subset of the programs. Is PTS extension (adding axioms, relations etc.) monotonic with respect to the set of well-typed programs?
That's what I proved when I talked about "conservative extension" :-) I can explain better in person.
> I also agree that it is probably not problematic because we would type-check each file in "its" type system only.
Does this preserve useful PTS properties, such as soundness, or uniqueness of types, or other ones used in the implementation? I think that's less than obvious. If the union of the PTSs is full and functional, instead, the composed program is automatically a program in the bigger PTS. So we could check that the union of all used PTSs satisfies these properties.
> For instance, one could write a library in STLC and then import it into another module which uses the STLC on the type level.
> If the language of a module supports a superset of STLC on multiple universes, one could even import the STLC module multiple times (one would have to specify in the import on which level it is supposed to be imported).
I think that sounds cool :-)
> I'm not quite sure how this idea relates to universe polymorphism and cumulativity.
Universe polymorphism seems a restriction of this to the case where "renaming" is specialized to "add k to the constants" and all levels are the same.
TL;DR: Cumulativity is not really relevant; it addresses similar problems as universe polymorphism, but it's technically very different. It's just subtyping among universes (which destroys uniqueness of typing). AFAIK PTSs aren't cumulative per se, but I think you can expand a PTS into a "cumulative" one, which cannot be functional (because we don't get uniqueness of typing), at the cost of a size explosion. (E.g., for every axiom/rule with resulting sort `*_k`, you add copies with resulting sort `*_j` for all `j > k`, so that every value is also a type.)
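A sketch of that expansion for a finite hierarchy of level-indexed sorts `0..n` (a hypothetical representation; the comprehension keeps the originals at `j = k` and adds the copies at `j > k`):

```haskell
-- Hypothetical representation: an axiom (c, k) means c : *_k, and a
-- rule (s1, s2, k) has resulting sort *_k.
cumExpandAxioms :: Int -> [(String, Int)] -> [(String, Int)]
cumExpandAxioms n as = [ (c, j) | (c, k) <- as, j <- [k .. n] ]

cumExpandRules :: Int -> [(Int, Int, Int)] -> [(Int, Int, Int)]
cumExpandRules n rs = [ (s1, s2, j) | (s1, s2, k) <- rs, j <- [k .. n] ]
```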
> Another thought about the idea of a reverse subtyping relation: let's say we have an STLC module importing a System F module. Would it be sound to allow the STLC module to use terms from the System F module that happen to have an STLC type?
That's even overly conservative. A scenario @Toxaris and I discussed is having DSLs based on smaller PTS but using code from a bigger PTS. Imagine STLC with polymorphic lists: you could define polymorphic lists in System F omega (with Church encoding), and then you could use them in STLC. That just works because, luckily enough, the PTS instance is only used for typechecking Pi, not for application (in fact, not even for lambdas) — in this interpreter, it seems we can already add constants which aren't sorts, and this would work just as well.
> I wouldn't expect any type soundness problems, since you can typecheck the composed program in the more powerful type system, but maybe adding additional typing rules can invalidate other invariants, such as parametricity? [... See also discussion of strong normalization...]
Without looking at the code from the more powerful system, you only get the (possibly weaker) guarantees of the bigger system, in general. I imagine that if the code in the more powerful system is somehow well-behaved, then you can keep the guarantees, but I have no clear idea what "well-behaved" would mean (for instance, I suspect non-looping terms from the bigger system might not destroy strong normalization, but I'm not sure that's actually true).
I don't understand how people use "more powerful system" in this issue. Can we try to talk about "small system" and "big system", where smaller = fewer sorts, axioms and/or rules?
I fear a non-looping term from a not strongly normalizing language might still lead to nontermination at its use site.
Above, we seem to assume that the sub-PTS relationship is defined by taking the subset of the sorts, axioms and relations.
But I feel that sub-PTS should form a lattice with lambda-star at the top and the empty PTS at the bottom. Rationale: If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *. So lambda-star should be the top element of a lattice.
I don't know how to reconcile these two intuitions of the sub-PTS relationship.
> I don't understand how people use "more powerful system" in this issue. Can we try to talk about "small system" and "big system", where smaller = fewer sorts, axioms and/or rules?
Yes. AFAICS, we used "more powerful" to mean bigger.
> I fear a non-looping term from a not strongly normalizing language might still lead to nontermination at its use site.
Yes, that's what I fear, so figuring out general conditions is probably hard.
> Above, we seem to assume that the sub-PTS relationship is defined by taking the subset of the sorts, axioms and relations.
Indeed, both Klaus and I are explicit on that. If you fix a PTS and consider its subsets, you get the product of three powerset lattices (is that a lattice?).
> But I feel that sub-PTS should form a lattice with lambda-star at the top and the empty PTS at the bottom. Rationale: If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *. So lambda-star should be the top element of a lattice.
> I don't know how to reconcile these two intuitions of the sub-PTS relationship.
The subset criterion is a syntactic approximation of "typeable in PTS1 implies typeable in PTS2", which is a semantic criterion. Two PTSs might be semantically comparable yet syntactically incomparable under the subset criterion, but if they're subset-comparable then they compare the same way semantically. So the semantic criterion is just a bigger relation — which seems harder to reason about, though (a bit like how showing observational equivalence is harder than showing beta-equivalence). Under the subset criterion, lambda-star is incomparable to any PTS where `*` has a non-`*` type. See (NOTE) below.
Syntactically speaking, I think lambda-star can be obtained from any other PTS by merging sorts (and updating rules correspondingly) — and this preserves typeability. Also allowing merging makes more PTS pairs syntactically comparable, but I'm not sure whether this gives a complete criterion or still an approximation. (Say, if you have any PTS with star and box and merge them by replacing box with star, you get lambda-star.)
The shape of this "semantic" lattice is also (mathematically) interesting, because it seems you can go an infinite number of steps in some direction before getting to lambda-star (for instance, consider System F, F_2, F_3, ..., F_omega — right? If F_2 is a PTS). I have no idea if there's something immediately below lambda-star. Maybe you can go down from lambda-star by moving the circular rule to a higher level? So Fomega* would be above the infinite progression of F_i systems, but below lambda-star? (It seems this sequence probably gives rise to some crazy ordinal number — we are already at omega+omega; in fact, I imagine we can probably add kind polymorphism to F-omega and start another infinite progression.)
(NOTE) I guess that means "all PTS but the empty one, and maybe a small, finite number of exceptions, and (possibly infinite) crazy PTSs where * : * and you have more useless rules".
> If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *.
Really? That's not obvious to me. Do you have a reference?
I think this is not true in general. For instance, a PTS may have additional axioms which lambda-star does not have. Maybe it would work if one added the axiom `|- c : *` for every axiom `|- c : s` in the other PTS.
> I think this is not true in general. For instance, a PTS may have additional axioms which lambda-star does not have. Maybe it would work if one added the axiom `|- c : *` for every axiom `|- c : s` in the other PTS.
I think the idea only makes sense (as is) for systems where all constants are sorts. Otherwise you'd need those rules, yes.
I think you can prove the result similarly to what I did above: the proof will require care but should not be long. You need to translate terms to replace sorts with `*`, and you need to show that after translating a typing derivation, all sorts become `*`. The less obvious part is the application rule.
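The translation itself is straightforward; a sketch over a minimal PTS term syntax (illustrative, not the project's actual AST):

```haskell
-- Minimal PTS term syntax.
data Term
  = Sort String             -- a sort constant
  | Var String
  | App Term Term
  | Lam String Term Term    -- \x : T. body
  | Pi  String Term Term    -- Pi x : T. body

-- The translation: replace every sort by *.
collapse :: Term -> Term
collapse (Sort _)    = Sort "*"
collapse (Var x)     = Var x
collapse (App a b)   = App (collapse a) (collapse b)
collapse (Lam x t b) = Lam x (collapse t) (collapse b)
collapse (Pi  x t b) = Pi  x (collapse t) (collapse b)
```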
I've skimmed "Lambda calculi with types", and it only seems to mention that lambda-star is a collapsed version of lambda-C (the calculus of constructions, I guess).
> Full PTS was something about making the typing rule for functions syntax-directed.
"Lambda calculi with types" says that "full" means "having relations across all sorts" and says that lambda2 (System F) is not full. So I don't believe the restriction to full PTSs is still there.
> So I don't believe the restriction to full PTSs is still there.
OK, the referenced paper talks about semi-full PTSs.
I wrote:
> If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *.

@klauso asked:
> Really? That's not obvious to me. Do you have a reference?
No reference; I picked this up in some discussion at a conference. You and @Blaisorblade seem to be correct that it only works if there are no non-sort constants, or one has to add `c : *` axioms for all non-sort constants `c`.
More formally, I would phrase it like this. Given:

- two sets of constants `C1` and `C2`,
- corresponding PTS instances `(S1, A1, R1)` and `(S2, A2, R2)`,
- a function `f : C1 -> C2` such that
  - `f c` is in `S2` for all `c` in `S1`,
  - `(f c1, f c2)` is in `A2` for all `(c1, c2)` in `A1`, and
  - `(f c1, f c2, f c3)` is in `R2` for all `(c1, c2, c3)` in `R1`,
- two terms `A` and `B` and a context `Γ` with `Γ ⊢ A : B`,

then `f Γ ⊢ f A : f B`, where `f` is extended to map structurally over contexts and terms.

In other words, if `f` preserves the structure of PTS instances, it also preserves typeability.
Actually, given any PTS, the function `f _ = *` is a structure-preserving map from the constants of that PTS to λ*, so λ* is the top element of the order induced by the existence of structure-preserving maps. We just have to translate all constants (including non-sort constants) to `*`.
It still sounds strange that there is a "maximal" type system within PTS. I wonder whether lambda-star rejects enough programs to still be useful, then.
Another "maximal" type system in some sense is a lambda calculus with recursive types. Any untyped LC term can be typed by typing everything as `mu T. T -> T`. But, arguably, too many programs are accepted this way. For instance, with this approach one cannot Church-encode numbers and type-check that a function will, say, return a number.
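The `mu T. T -> T` trick can be rendered in Haskell with a recursive newtype — every untyped LC term embeds at the single type `U`, including the looping `omega`:

```haskell
-- U is mu T. T -> T: a value of type U is a function U -> U.
newtype U = U { unU :: U -> U }

-- Application at the universal type.
app :: U -> U -> U
app f x = unU f x

-- omega = (\x. x x) (\x. x x) is well-typed at U, and loops.
omega :: U
omega = app delta delta
  where delta = U (\x -> app x x)
```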
Does Lambda-* suffer from a similar problem? For instance, if one would define a type for Church numerals, would some terms that are not supposed to type-check as Church-numerals still have that type?
All I know is: λ* is type-safe and λ* admits non-normalising terms.
What I guess: λ* does not admit typed self-representation.
> For instance, with this approach one cannot Church-encode numbers and type-check that a function will, say, return a number.
I think the only problem you get is nontermination. The type of Church numerals still exists — you get nonterminating programs, but you keep type soundness. You have `mu T. T -> T` as an alternative, but if you use that type for a Church numeral, I think it's your choice. In particular, I'm even less sure how that would translate to lambda*.
> Does Lambda-* suffer from a similar problem? For instance, if one defined a type for Church numerals, would some terms that are not supposed to type-check as Church numerals still have that type?
Since lambda* is type-safe, you will never get a type error. OTOH, if you could inhabit bottom (which is not known to this day, AFAIK — only looping combinators are known, not fixpoint ones), you could use it to write `nonTerminatingCast : a -> b` and use it inside a term of Church-numeral type, so this type (indeed, any type) is enlarged by exotic, nonterminating terms.
Still, not sure that's a problem.
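For comparison, in any language with general recursion the same enlargement is easy to exhibit — a single looping definition inhabits every type:

```haskell
-- A looping definition at type a -> b: it typechecks at every type,
-- so every type gains exotic, nonterminating inhabitants.
nonTerminatingCast :: a -> b
nonTerminatingCast x = nonTerminatingCast x
```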
We can have nonterminating terms without having a fixed-point combinator. For example, we can use a looping combinator to implement the nonterminating term :)
The difference between looping combinators and fixed point combinators is the type of the recursive calls, not the dynamic behavior.
> We can have nonterminating terms without having a fixed-point combinator. For example, we can use a looping combinator to implement the nonterminating term :)
Sure, but with arbitrary types? In this case, yes: "Type" is not a type, and http://www.cs.cmu.edu/~kw/scans/hurkens95tlca.pdf explains that Girard's paradox yields a term of type `forall T : *. T`, that is, `_|_`. So yes, we can in particular implement a non-terminating cast.
(I thought again about PLDI'09's footnote, but the problem is that we'd need a terminating general-recursive type-level program to write a non-terminating term in it; so we could have bottom without having non-terminating terms in Fomega*.)
I'm a bit lost. What footnote? At least in a call-by-value interpretation, if we have non-terminating terms at all, we can smuggle them into terms of arbitrary types by beta-expansion.
> I'm a bit lost. What footnote?

What footnote? What footnote??? It's "the footnote" from footnote.txt :laughing::
https://github.com/Toxaris/typed-self-representation/blob/master/journal-version/footnote.txt#L6
> At least in a call-by-value interpretation, if we have non-terminating terms at all, we can smuggle them into terms of arbitrary types by beta-expansion.
As far as we know, in Fomega* we only have non-terminating types, so it's an open problem whether there are non-terminating terms.
To quote the original footnote:
"It would be interesting, though, to analyze whether the inconsistency is confined to the kind system or whether it leaks into the type system".
I don't really understand how the question of strong normalization in Fomega* applies to this issue.
+1
> I don't really understand how the question of strong normalization in Fomega* applies to this issue.
No, it doesn't: it's just that we learned about looping combinators in that context.
> One problem with all of the above is that in PTS we often add at some level a constant called `notExpressible`, with a rule `notExpressible : notExpressible`, and that invalidates the results I mentioned above: in particular, `eval_PTS2 t != eval_PTS1 t` even if `|-_PTS1 t : T` and `PTS1 <= PTS2`, because some occurrences of `notExpressible` must be replaced by another constant.
> Ultimately, this is because our typed terms don't strictly encode PTS typing derivations: they try to ensure that each term has a type, therefore always producing cyclic derivations for constants. The original Axiom rule, instead, does not require sorts to be well-typed themselves.
@Toxaris fixed the above in 3d3be1db5fcdee2171466808c38df524fbf1287c :-)
That fix should make the following idea easier to try out (I expect the change to be easy and small, but it should be experimental).
> importing code [from] smaller instances is clearly sound — you can use the weaker system to typecheck them and use the results (types and values) in the bigger system.
> One obvious solution would be to only allow imports of files which use the same PTS instance.
I hope we could implement this meanwhile, to avoid hacks like ce2555670c79a74b512f9f837ea934c4b3a21486 (which is the best we have).
I would be happy to have some form of optional `language foo;` statement and checking that all imported modules use the same language as the importing module, but this has a low priority for me. You want to implement this, @Blaisorblade?
I've split out issue #111 for the `language foo;` statement, so that we can think of the `language foo;` statement and PTS subtyping separately.
Meanwhile, I've realized an implementation issue for PTS subtyping. (Implementing that looked like more fun, though one might want a separate option for it).
I planned to convert instances into data structures to allow checking subsetting (because, well, that seems the easiest thing), but then I ran into infinite PTSs (fomegaomega).
Can we implement subtyping for such infinite PTSs? I guess not directly — given their descriptions, we can prove or disprove the subset relation, but not more.
Arguably we should also have an infinite PTS for calculi with infinite universes and universe polymorphism, but it's not there yet. So for now I could just implement a check of whether a finite PTS is a subset of an infinite one.
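A sketch of how such a check could look, if an infinite PTS is described by membership predicates (illustrative names; the finite representation repeats the hypothetical one from above):

```haskell
import qualified Data.Set as Set

-- Finite PTS as explicit sets.
type Sort = String
data PTS = PTS
  { sorts  :: Set.Set Sort
  , axioms :: Set.Set (Sort, Sort)
  , rules  :: Set.Set (Sort, Sort, Sort)
  }

-- A possibly infinite PTS, given by membership predicates.
data PTSSpec = PTSSpec
  { isSort  :: Sort -> Bool
  , isAxiom :: (Sort, Sort) -> Bool
  , isRule  :: (Sort, Sort, Sort) -> Bool
  }

-- A finite PTS is a subset of the spec iff every element of each
-- component satisfies the corresponding predicate.
finiteSubsetOf :: PTS -> PTSSpec -> Bool
finiteSubsetOf p spec =
  all (isSort spec)  (Set.toList (sorts p))
    && all (isAxiom spec) (Set.toList (axioms p))
    && all (isRule spec)  (Set.toList (rules p))
```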
> So for now I could just implement a check of whether a finite PTS is a subset of an infinite one.
Done. More precisely, I can check whether a finite PTS is a subset of another one in the obvious way.
@klauso wrote:
> We've discussed this multiple times, but I couldn't find an issue for it.