Toxaris / pts

Interpreter for functional pure type systems.
BSD 3-Clause "New" or "Revised" License
21 stars 7 forks source link

Figure out how to integrate modules from different languages #100

Open Blaisorblade opened 10 years ago

Blaisorblade commented 10 years ago

@klauso wrote:

wouldn't it be nice to have a Racket-like #lang thing at the beginning of a file for the pts interpreter?

We've discussed this multiple times, but I couldn't find an issue for it.

Blaisorblade commented 10 years ago

@Toxaris added:

Yes. But the design is not obvious. What happens if a file imports a file with a different #lang? Maybe you can open an issue and we discuss it on github?

Related to this: Is there a subtyping relationship between PTS instances?

klauso commented 10 years ago

One obvious solution would be to only allow imports of files which use the same PTS instance.

I guess there is also a simple "subtyping" relationship, which is basically a subset relationship of the sorts , axioms, and relations (plus renaming).

Toxaris commented 10 years ago

Sorry, mistook the "close" for a "cancel" button. Maybe that aspect of the github issues interface is just wrong, I saw it happen to others.

klauso commented 10 years ago

I guess the "subtyping" proposal would allow modules written in a powerful type system to import modules written in a less powerful type system, but not vice versa.

Or could one also allow the other direction somehow?

Or does all this not work anyway, e.g., because bad things can happen if one passes a higher-order function (similar to the complications of blame assignment with higher-order contracts)?

Blaisorblade commented 10 years ago

I guess there is also a simple "subtyping" relationship, which is basically a subset relationship of the sorts , axioms, and relations (plus renaming).

I guess the "subtyping" proposal would allow modules written in a powerful type system to import modules written in a less powerful type system, but not vice versa.

That's also my guess.

|-_PTS1 t : T and PTS1 <= PTS2, then |-_PTS2 t : T — that is, PTS1 <= PTS2 means that the typing judgement in PTS2 is a conservative extension of the one in PTS1.

Proof: by structural induction on the typing derivation of |-_PTS1 t : T. Let PTS_1 = (S1, A1, R1) and PTS_2 = (S2, A2, R2) as usual. The only interesting cases are for axiom and Pi types, and the proof is similar. The rule checks that something belongs to A1 (resp. R1), but then it belongs to A2 (resp. R2), so |-_PTS2 t : T. All contained PTS1-derivations are PTS2-derivations by the inductive hypothesis. QED.

This should be even more obvious for (untyped) evaluation, since it's untyped.

If this holds, importing code in smaller instances is clearly sound — you can use the weaker system to typecheck them and use the results (types and values) in the bigger system.

One could also consider imports of bigger instances with less guarantees.

Given my experience with advanced Scala, I propose to be conservative and implement only something which is obviously correct.

The most conservative proposal would be to allow importing modules that have either no #lang or the same one as the file including them — this should also be easiest to implement (since one doesn't need to switch instance midway — though even that is probably easy-ish).


One problem with all of the above is that in PTS we often add at some level a constant called notExpressible, with a rule notExpressible : notExpressible, and that invalidates the results I mentioned above: in particular, eval_PTS2 t != eval_PTS1 t even if |- PTS1 t : T and PTS1 <= PTS2, because some occurrences of notExpressible must be replaced by another constant.

Ultimately, this is because our typed terms don't encode strictly PTS typing derivation, and try to ensure that each term has a type, therefore always producing cyclic derivations for constants. Instead, the original Axiom rule does not require sorts to be well-typed themselves.

We could change this: we could define TypedTerm = ... | TypedConst C C and take constants out of TermStructure. To reduce the impact on the codebase, we could still have accessors return notExpressible as long as we don't store it in the tree, and use GHC pattern synonyms to ease the transition (we could also use views and then update matchers with sed).

Toxaris commented 10 years ago

The most conservative proposal would be to allow importing modules that have either no #lang or the same one as the file including them.

I'm considering to do this. The subtyping stuff sounds cool, too, but maybe too complicated.

klauso commented 10 years ago

A crazy idea about PTS interoperability:

If we consider a setting with subtyping (and renaming) as described above this would enable some potentially cool things:

For instance, one could write a library in STLC and then import it into another module which uses the STLC on the type level.

If the language of a module supports a superset of STLC on multiple universes, one could even import the STLC module multiple times (one would have to specify in the import on which level it is supposed to be imported).

I'm not quite sure how this idea relates to universe polymorphism and cumulativity.

klauso commented 10 years ago

Another thought about the idea of a reverse subtyping relation: Let's say we have a STLC module importing a System F module. Would it be sound to allow the STLC module to use terms from the System F module that happen to have a STLC type? I wouldn't expect any type soundness problems since you can typecheck the composed program in the more powerful type system, but maybe adding additional typing rules can invalidate other invariants, such as parametricity?

Toxaris commented 10 years ago

Potential issues with importing from bigger PTS (that is, PTS with more sorts, rules or axioms):

  1. Maybe the smaller PTS is strongly normalizing and the bigger PTS isn't.
  2. Maybe we import from two different bigger PTS, so the result is well-typed in the union PTS, but maybe the union PTS is no longer functional. But our typechecking only works for functional PTS. (Maybe not a problem because each file is typechecked individually, but I don't know).
klauso commented 10 years ago

What do you mean by "not functional"?

Toxaris commented 10 years ago

See this comment about the restriction of our type checker to "full and functional" PTS. I think you put it there when you wrote the PTS type checker back in the summer of 2008.

klauso commented 10 years ago

Ah, now I remember vaguely. Functional PTS was something that implied uniqueness of types. Full PTS was something about making the typing rule for functions syntax-directed. I agree that PTS composition does not preserve these properties. I also agree that it is probably not problematic because we would type-check each file in "its" type system only.

klauso commented 10 years ago

I think your example with strong normalization is useful, though.

I wonder whether it also works the other way around: Is it possible that a smaller PTS is not strongly normalizing but a bigger one is? The answer is a clear "no" if PTS1 < PTS2 means that PTS1 accepts only a subset of the programs. Is PTS extension (adding axioms, relations etc.) monotonic with respect to the set of well-typed programs?

Blaisorblade commented 10 years ago

I wonder whether it also works the other way around: Is it possible that a smaller PTS is not strongly normalizing but a bigger one is? The answer is a clear "no" if PTS1 < PTS2 means that PTS1 accepts only a subset of the programs. Is PTS extension (adding axioms, relations etc.) monotonic with respect to the set of well-typed programs?

That's what I proved when I talked about "conservative extension" :-) I can explain better in person.

I also agree that it is probably not problematic because we would type-check each file in "its" type system only.

Does this preserve useful PTS properties, such as soundness, or uniqueness of types, or other ones used in the implementation? I think that's less than obvious. If the union of the PTSs is full and functional, instead, the composed program is automatically a program in the bigger PTS. So we could check that the union of all used PTSs satisfies these properties.

Blaisorblade commented 10 years ago

For instance, one could write a library in STLC and then import it into another module which uses the STLC on the type level.

If the language of a module supports a superset of STLC on multiple universes, one could even import the STLC module multiple times (one would have to specify in the import on which level it is supposed to be imported).

I think that sounds cool :-)

I'm not quite sure how this idea relates to universe polymorphism and cumulativity.

Universe polymorphism seems a restriction of this to the case where "renaming" is specialized to "add k to the constants" and all levels are the same.

TL; DR. Cumulativity is not really relevant, it addresses similar problems as universe polymorphism, but it's technically very different. It's just subtyping among universes (which destroys uniqueness of typing). AFAIK PTSs aren't cumulative per se, but I think you can expand a PTS to a "cumulative" one, which cannot be functional (because we don't get uniqueness of typing), with a size explosion. (E.g. for every axiom/rule with resulting sort _k, you add copies of the rule with resulting sort _j for all j > k, so that every value is also a type).

Blaisorblade commented 10 years ago

Another thought about the idea of a reverse subtyping relation: Let's say we have a STLC module importing a System F module. Would it be sound to allow the STLC module to use terms from the System F module that happen to have a STLC type?

That's even overly conservative. A scenario I and @Toxaris discussed is having DSLs based on smaller PTS but using code from a bigger PTS. Imagine STLC with polymorphic lists. You could define polymorphic lists in System F omega (with Church-encoding), and then you could use them in STLC. That just works because, luckily enough, the PTS instance is only used for typechecking Pi, not for application (in fact, not even for lambdas) — in this interpreter, it seems we can already add constants which aren't sorts, and this would work just as well.

I wouldn't expect any type soundness problems since you can typecheck the composed program in the more powerful type system, but maybe adding additional typing rules can invalidate other invariants, such as parametricity? [... See also discussion of strong normalization...]

Without looking at the code in the more powerful system, you only get the (possibly weaker) guarantees of the bigger system, in general. I imagine that if the code in the more powerful system is somehow well-behaved, then you can keep the guarantees, but I have no clear idea what "well-behaved" would mean (for instance, I suspect non-looping terms from the bigger system might not destruct strong normalization, but I'm not sure that's actually true).

Toxaris commented 10 years ago

I don't understand how people use "more powerful system" in this issue. Can we try to talk about "small system" and "big system", where smaller = less sorts, axioms and/or rules?

I fear a non-looping term from a not strongly normalizing language might still lead to nontermination at its use site.

Toxaris commented 10 years ago

Above, we seem to assume that the sub-PTS relationship is defined by taking the subset of the sorts, axioms and relations.

But I feel that sub-PTS should form a lattice with lambda star at the top and the empty PTS at the bottom. Rationale: If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *. So lambda-star should be the top element of a lattice.

I don't know how to reconcile these two intuitions of the sub-PTS relationship.

Blaisorblade commented 10 years ago

I don't understand how people use "more powerful system" in this issue. Can we try to talk about "small system" and "big system", where smaller = less sorts, axioms and/or rules?

Yes. AFAICS, we used "more powerful" to mean bigger.

I fear a non-looping term from a not strongly normalizing language might still lead to nontermination at its use site.

Yes, that's what I fear, so figuring out general conditions is probably hard.

Above, we seem to assume that the sub-PTS relationship is defined by taking the subset of the sorts, axioms and relations.

Indeed, both I and Klaus are explicit on that. If you fix a PTS and consider its subsets, you have the product of three powerset lattices (is that a lattice)?

But I feel that sub-PTS should form a lattice with lambda star at the top and the empty PTS at the bottom. Rationale: If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *. So lambda-star should be the top element of a lattice.

I don't know how to reconcile these two intuitions of the sub-PTS relationship.

The subset criterion is a syntactic approximation of "typeable in PTS1 is typeable in PTS2", which is a semantic criterion. Two semantically PTSs might be syntactically incomparable under the subset criterion, but if they're subset-comparable then they compare semantically the same way. So it seems that the latter criterion is just a bigger relation — which seems hard to reason about though (a bit like showing observational equivalence is harder than showing beta-equivalence). Under the subset criterion, lambda-star is incomparable to any PTS where * has a non-* type. See (NOTE).

Syntactically speaking, I think lambda* can be obtained by merging sorts (and updating rules correspondingly) from any other PTS — and this preserves typeability. Allowing also for merging allows to compare syntactically more PTS pairs, but I'm not sure this is a complete criterion or we have still a approximation. (Say, if you have any PTS with star and box, and merge them by replacing box with star, you get lambda*).

The shape of this "semantic" lattice is also (mathematically) interesting, because it seems you can go an infinite number of steps in any direction before getting to lambda* (for instance, consider System F, F_2, F_3... F_omega — right? If F_2 is a PTS). I have no idea if there's something immediately below lambda-star. Maybe you can go down from lambda* by moving the circular rule to higher-level? So Fomega* is above the infinite progression of F_i systems, but below lambda*? (So it seems this sequence probably gives rise to some crazy ordinal number — we are already at omega+omega; in fact, I imagine we can probably add kind-polymorphism to f-omega and go on another infinite progression).

(NOTE) I guess that means "all PTS but the empty one, and maybe a small, finite number of exceptions, and (possibly infinite) crazy PTSs where * : * and you have more useless rules".

klauso commented 10 years ago

If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *.

Really? That's not obvious to me. Do you have a reference?

I think this is not true in general. For instance, a PTS may have additional axioms which lambda-star does not have. Maybe it would work if one would add the axiom |- c : * for every axiom |- c : s in the other PTS.

Blaisorblade commented 10 years ago

I think this is not true in general. For instance, a PTS may have additional axioms which lambda-star does not have. Maybe it would work if one would add the axiom |- c : * for every axiom |- c : s in the other PTS.

I think the idea only makes sense (as is) for systems where all constants are sorts. Otherwise you'd need those rules, yes.

I think you can prove the result similarly to what I did above: this proof will requires care but should not be long. You need to translate terms to replace sorts with *, and you need to show that after translating a typing derivation, all sorts become *. The less obvious part is the application rule.

I've skimmed "Lambda calculi with types", and it only seems to mention that lambda* is a collapsed version of lambdaC (the calculus of constructions I guess).

Full PTS was something about making the typing rule for functions syntax-directed.

"Lambda calculi with types" says that "full" means "having relations across all sorts" and says that lambda2 (System F) is not full. So I don't believe the restriction to full PTSs is still there.

Blaisorblade commented 10 years ago

So I don't believe the restriction to full PTSs is still there.

OK, the referenced paper talks about semi-full PTSs.

Toxaris commented 10 years ago

I wrote:

If you can type a term in any PTS at all, you can type it in lambda-star by replacing all sorts with *.

@klauso asked:

Really? That's not obvious to me. Do you have a reference?

No reference, I picked this up in some discussion at a conference. You and @Blaisorblade seem to correct in that it only works if there are no non-sort constants, or that one has to add c : * axioms for all constants non-sort constants c.

More formally, I would phrase it like this: Given two sets of constants C1 and C2, corresponding PTS instances (S1, A1, R1) and (S2, A2, R2), a function f : C1 -> C2 so that

  1. f c in S2 for all c in S1,
  2. (f c1, f c2) in A2 for all (c1, c2) in A1 and
  3. (f c1, f c2, f c3) in R2 for all (c1, c2, c3) in R1,

two terms A and B and a context Γ with Γ ⊢ A : B, then f Γ ⊢ f A : f B where f is extended to map structurally over contexts and terms.

In other words, if f preserves the structure of PTS instances, it also preserves typeability.

Toxaris commented 10 years ago

Actually, given any PTS, the function f _ = * is a structure-preserving map from the constants of that PTS to λ, so λ is the top element of the order induced by the existence of structure-preserving maps. We just have to translate all constants (including non-sort constants) to *.

klauso commented 10 years ago

It still sounds strange that there is a "maximal" type system within PTS. I wonder whether lambda-* rejects enough programs to be still useful, then.

Another "maximal" type system in some sense is a lambda calculus with recursive types. Any untyped LC term can be typed by typing everything as mu T. T -> T. But, arguably, too many programs are accepted this way. For instance, with this approach one cannot Church-encode numbers and type-check that a function will, say, return a number.

Does Lambda-* suffer from a similar problem? For instance, if one would define a type for Church numerals, would some terms that are not supposed to type-check as Church-numerals still have that type?

Toxaris commented 10 years ago

All I know is: λ is type-safe and λ admits non-normalising terms.

What I guess: λ* does not admit typed self-representation.

Blaisorblade commented 10 years ago

For instance, with this approach one cannot Church-encode numbers and type-check that a function will, say, return a number.

I think the only problem you get is nontermination. The type of Church-numerals still exists — you get nonterminating programs, but you keep type soundness. You have mu T. T -> T as alternative, but if you use that type for a Church-numeral, I think it's your choice. Especially, I'm even less sure how that would translate to lambda*.

Does Lambda-* suffer from a similar problem? For instance, if one would define a type for Church numerals, would some terms that are not supposed to type-check as Church-numerals still have that type?

Since lambda* is type-safe, you will never get a type error. OTOH, if you could inhabit bottom (which is not known to this day, AFAIK — only looping combinators are known, not fixpoint ones), you could use it to write nonTerminatingCast : a -> b and use it inside a term of type Church-numeral, so this type (any type) is enlarged by exotic, nonterminating terms. Still, not sure that's a problem.

Toxaris commented 10 years ago

We can have nonterminating terms without having a fixed point combinators. For example, we can use a looping combinator to implement the nonterminating term :)

The difference between looping combinators and fixed point combinators is the type of the recursive calls, not the dynamic behavior.

Blaisorblade commented 10 years ago

We can have nonterminating terms without having a fixed point combinator. For example, we can use a looping combinator to implement the nonterminating term :)

Sure, but with arbitrary types? In this case yes: "Type" is not a type, and http://www.cs.cmu.edu/~kw/scans/hurkens95tlca.pdf, explain that Girard's paradox yields a term of type "forall T : *. T", that is, _|_. So yes, we can in particular implement a non-terminating cast.

(I thought again about PLDI'09's footnote, but the problem is that we'd need a terminating general recursive type-level program to write a non-terminating term in it; so we could have bottom without having non-terminating terms in Fomega*).

Toxaris commented 10 years ago

I'm a bit lost. What footnote? At least in a call-by-value interpretation, if we have non-terminating terms at all, we can smuggle them into terms of arbitrary types by beta-expansion.

Blaisorblade commented 10 years ago

I'm a bit lost. What footnote? What footnote? What footnote??? It's "the footnote" from footnote.txt :laughing: :

https://github.com/Toxaris/typed-self-representation/blob/master/journal-version/footnote.txt#L6

At least in a call-by-value interpretation, if we have non-terminating terms at all, we can smuggle them into terms of arbitrary types by beta-expansion.

As far as we know, in Fomega* we only have non-terminating types, so it's an open problem whether there are non-terminating terms.

To quote the original footnote:

"It would be interesting, though, to analyze whether the inconsistency is confined to the kind system or whether it leaks into the type system".

Toxaris commented 10 years ago

I don't really understand how the question of strong normalization in fomegastar applies to this issue.

klauso commented 10 years ago

+1

Blaisorblade commented 10 years ago

I don't really understand how the question of strong normalization in fomegastar applies to this issue.

No it doesn't: it's just that we learned about looping combinators in that context.

Blaisorblade commented 10 years ago

One problem with all of the above is that in PTS we often add at some level a constant called notExpressible, with a rule notExpressible : notExpressible, and that invalidates the results I mentioned above: in particular, eval_PTS2 t != eval_PTS1 t even if |- PTS1 t : T and PTS1 <= PTS2, because some occurrences of notExpressible must be replaced by another constant.

Ultimately, this is because our typed terms don't encode strictly PTS typing derivation, and try to ensure that each term has a type, therefore always producing cyclic derivations for constants. Instead, the original Axiom rule does not require sorts to be well-typed themselves.

@Toxaris fixed the above in 3d3be1db5fcdee2171466808c38df524fbf1287c :-)

That fix should make the following idea easier to try out (I expect the change should be easy and small, but should be experimental).

importing code [from] smaller instances is clearly sound — you can use the weaker system to typecheck them and use the results (types and values) in the bigger system.

Blaisorblade commented 10 years ago

One obvious solution would be to only allow imports of files which use the same PTS instance.

I hope we could implement this meanwhile, to avoid hacks like ce2555670c79a74b512f9f837ea934c4b3a21486 (which is the best we have).

Toxaris commented 10 years ago

I would be happy to have some form of optional language foo; statement and checking that all imported modules use the same language as the importing module, but this has a low priority for me. You want to implement this, @Blaisorblade?

Blaisorblade commented 10 years ago

I've split out issue #111 for the language foo statement, so that we can think of the language foo; statement and PTS subtyping separately.

Meanwhile, I've realized an implementation issue for PTS subtyping. (Implementing that looked like more fun, though one might want a separate option for it).

I planned to convert instances to be data structures to allow checking subsetting (because, well, tha seems the easiest thing), but then I ran into infinite PTSs (fomegaomega).

Can we implement subtyping for such infinite PTSs? I guess not directly — given their descriptions, we can prove or disprove the subset relation, but not more.

Arguably we should have also an infinite PTS for calculi with infinite universes and universe polymorphism, but it's not there yet. So for now I could just implement a check whether a finite PTS is a subset of an infinite one.

Blaisorblade commented 10 years ago

So for now I could just implement a check whether a finite PTS is a subset of an infinite one.

Done. More precisely, I can check whether a finite PTS is a subset of another one in the obvious way.