cs3110 / textbook

The CS 3110 Textbook, "OCaml Programming: Correct + Efficient + Beautiful"
Other
740 stars 134 forks source link

Correct the dynamic semantics of type annotations #117

Closed favonia closed 1 year ago

favonia commented 1 year ago

I want to thank you all for this excellent textbook. I am using this in my course now. However, I think the dynamic semantics of the type annotations is a bit off, and I hope this pull request can help improve the textbook from which I benefit greatly. Please consider this as my way of saying thanks.

The explanation is a bit off for two reasons: (1) there is still computation going on even if all annotations are removed, and (2) the dynamic semantics is a specification for the source language, not the target language, and type annotations are part of the syntax of the source language; the current wording would make sense only after specifying a target language without type annotations, but I believe we do not want to do that.

clarksmr commented 1 year ago

Hm, your revision says "If e evaluates to v, then (e : t) evaluates to v." That would mean, e.g., that (5 : bool) evaluates to 5. But it does not.

favonia commented 1 year ago

Hm, your revision says "If e evaluates to v, then (e : t) evaluates to v." That would mean, e.g., that (5 : bool) evaluates to 5. But it does not.

The static semantics ("If e has type t then (e : t) has type t.") will prevent (5 : bool) from having a type. Otherwise, the ill-typed expression if true then 5 else () will also run according to the dynamic semantics ("If e1 evaluates to true, and if e2 evaluates to a value v, then if e1 then e2 else e3 evaluates to v"). Such ill-typed expressions are prevented by the static semantics. Maybe there's some convention of the textbook that I was not aware of?

favonia commented 1 year ago

The main issue of the current wording is that it assumes there's an unspecified compilation process that turns (1+1 : int) into some sort of "untyped OCaml" 1+1 before running it. However, we can also say (1+1 : int) directly evaluates to 2 at the source level without referring to compilation, which is basically this PR.

favonia commented 1 year ago

PS: Standard ML 97 used "Reduced Syntax" in Section 6.1 of the Definition to specify such a compilation process. In this case, the dynamic semantics is explicitly defined for the "Reduced Syntax" instead of the original "Core Syntax." I felt that the textbook did not intend to apply similar tricks to reduce the number of formal inference rules (because it barely saves anything in the informal writing, and the actual compilation will do many more transformations). Still, maybe I am wrong about the intention of the textbook, and if so, I apologize. Anyway, in the case of Standard ML 97, (1+1 : int) in the Core Syntax is first transformed into 1+1 in the Reduced Syntax, and then the dynamic semantics for the Reduced Syntax evaluates it to 2. It means (1+1 : int) in the Core Syntax does have dynamic semantics that reduces it to 2.

clarksmr commented 1 year ago

The if example you give is not the best analogy. (Yours uses an if expression that can be evaluated to a value even if it is ill typed, and the rules in the book do so successfully.) A better analogy would be: if 5 then 1 else 2. That is also an ill-typed expression, but it cannot be evaluated to a value using the if dynamic semantics, because the guard is neither true nor false.

Likewise (e : t) is not something I directly give a dynamic semantics in this book. Rather, I treat it as a compile-time check, and at run-time (if it ever gets there) it behaves as e. Indeed to formally model that one could take the approach of The Definition of Standard ML. But consider the audience of my book, especially this early in it. I cannot assume they know so much about source languages, target languages, translation of surface syntax to core syntax, etc. So much formalism would be counter productive.

Btw, another (worse) alternative would be to say (e : t) evaluates to v if e evaluates to v and e : t. But that conflates compile-time with run-time.

As a solution, what I can do is explain a little bit more of this. I'll push something later that does that.

favonia commented 1 year ago

@clarksmr Thanks for your response. I feel there are two points that are worth further discussions:

  1. Hypothetically, if we had the dynamic semantics saying (5 : bool) will be evaluated to 5, what would be the issue? Previously, you mentioned that (5 : bool) does not evaluate to 5, but that is not a run-time restriction. The OCaml runtime does not check whether 5 is bool and can happily reduce (5 : bool) to 5 if we skip the type-checking, contrary to your claim. Perhaps I missed something?

  2. To clarify, I never meant that we should follow the Definition of Standard ML. Instead, my point is that the unspecified compilation process the textbook is alluding to is closer to the Definition that I believe you want to avoid. It is incorrect to associate my PR with the Definition. My PR is to avoid the Definition. We can also just ignore the Definition if that's causing confusion.

Again, thanks for the discussion, and I'm curious about what your solution is.