Open RobertHarper opened 8 years ago
but, can general values?
I don't think so. The ability to define let rec xs = 1 :: xs
in OCaml annoys me to no end. The guarantee that strict lists, trees, etc. are finite is very important to convince myself that recursive operations on them will terminate.
Out of curiosity, what is a need cell? Neither Google nor Google Scholar is being particularly helpful. Maybe it is related to the following use case.
I occasionally want the ability to initialize a reference cell so that it points to itself. For example, I think the most convenient implementation of the union-find data structure is to make non-unified cells point to themselves:
datatype 'a state = Cell of 'a cell | Value of 'a
withtype 'a cell = 'a state ref
fun fresh () = let val rec r = ref (Cell r) in r end
But this is not allowed by Standard ML today. And, as far as I can tell, this cannot work as long as ref
is an ordinary function.
I think it was described in #12 as a memoizing (i.e., call-by-need) construct.
You want to use explicit backpatching, the reference cell is there for that purpose. The val rec bindings in SML are misleading, because the only recursive things are functions, whereas in Haskell it "works" because every damn thing is a reference cell, and it does the backpatching for you. Simply allocate the cell in Cell state, and assign that cell to its contents.
Bob Harper
(c) Robert Harper. All rights reserved.
On Wed, Jan 2, 2019 at 3:09 PM eduardoleon notifications@github.com wrote:
Out of curiosity, what is a need cell? Neither Google nor Google Scholar is being particularly helpful. Maybe it is related to the following use case.
I occasionally want the ability to initialize a reference cell so that it points to itself. For example, I think the most convenient implementation of the union-find data structure is to make non-unified cells point to themselves:
datatype 'a state = Cell of 'a cell | Value of 'awithtype 'a cell = 'a state ref fun fresh () = let val rec r = ref (Cell r) in r end
But this is not allowed by Standard ML today. And, as far as I can tell, this cannot work as long as ref is an ordinary function.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SMLFamily/Successor-ML/issues/13#issuecomment-450970355, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdsdQrhwk075inAxZe5LPSfmflkDLjGks5u_RGQgaJpZM4H_WGo .
The problem is that there is no good value to initialize the reference cell with:
fun fresh () =
let val r = ref (Cell ???)
in r := Cell r; r end
In my admittedly limited understanding, the entire point to disallowing definitions like
val rec xs = 1 :: xs
is to make structural induction a valid reasoning principle for structurally recursive functions. However, even in today's Standard ML, this already fails when the datatype
recursion goes through reference cells, because mutation allows us to “tie the knot”:
infixr 5 :::
datatype 'a seq = NIL | ::: of 'a * 'a seq ref
fun loop NIL = ()
| loop (x ::: xs) = (print x; loop (!xs))
let
val knot = ref NIL
val seq = "hello" ::: knot
in
knot := seq;
loop seq
end
Hence, as far as I can tell, we do not lose anything by allowing cells to be initialized with references to themselves.
The fact is in such situations you have to have a "null" value that is only ever used to initialize a cell, and is then immediately obliterated on backpatching. See PFPL for a discussion. There's no way around this.
(c) Robert Harper. All rights reserved.
On Wed, Jan 2, 2019 at 4:14 PM eduardoleon notifications@github.com wrote:
The problem is that there is no good value to initialize the reference cell with:
fun fresh () =
let val r = ref (Cell ???)
in r := Cell r; r end
In my admittedly limited understanding, the entire point to disallowing definitions like
val rec xs = 1 :: xs
is to make structural induction a valid reasoning principle for structurally recursive functions. However, even in today's Standard ML, this already fails when the datatype recursion goes through reference cells, because mutation allows us to “tie the knot”:
infixr 5 ::: datatype 'a seq = NIL | ::: of 'a * 'a seq ref
fun loop NIL = ()
| loop (x ::: xs) = (print x; loop (!xs))
let
val knot = ref NIL
val seq = "hello" ::: knot in
knot := seq;
loop seq end
Hence, as far as I can tell, we do not lose anything by allowing cells to be initialized with references to themselves.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SMLFamily/Successor-ML/issues/13#issuecomment-450987021, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdsdfaVYZ8j-xAs_qH6D3_ZHoWTQkfIks5u_SErgaJpZM4H_WGo .
I'm not seeing why it has to be this way. OCaml is by no means a lazy language, yet it allows
type 'a state = Cell of 'a cell | Value of 'a
and 'a cell = 'a state ref
let rec r = ref (Cell r)
Of course, unfortunately it also allows
let rec xs = 1 :: xs
But I'm not seeing why we couldn't aim to allow the former while still forbidding the latter.
Adding a “null” value to the state
type is inconvenient. One (crude, but mostly automatic, hence very useful) way in which I check that my code covers all possible cases is to
These things fly out of the window if we add spurious cases to datatype
s.
ocaml screws this all up, but it has an implicit null value. it doesn't make sense in a by-value language to admit general recursion, but they try.
(c) Robert Harper. All rights reserved.
On Wed, Jan 2, 2019 at 4:51 PM eduardoleon notifications@github.com wrote:
I'm not seeing why it has to be this way. OCaml is by no means a lazy language, yet it allows
type 'a state = Cell of 'a cell | Value of 'a and 'a cell = 'a state ref
let rec r = ref (Cell r)
Of course, unfortunately it also allows
let rec xs = 1 :: xs
But I'm not seeing why we couldn't aim to allow the former while still forbidding the latter.
Adding a “null” value to the state type is inconvenient. One (crude, but mostly automatic, hence very useful) way in which I check that my code covers all possible cases is to
- Ensure all matches are exhaustive.
- Ensure all control flow paths are reachable.
- Ensure no exceptions are raised outside of code that does I/O.
These things fly out of the window if we add spurious cases to datatypes.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SMLFamily/Successor-ML/issues/13#issuecomment-450996485, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdsdf73uome2AfH3EHEGObX79U89dnvks5u_Sn1gaJpZM4H_WGo .
there's no way to avoid it afaik. ocaml puts it under the hood; you can get a run-time error amounting a match exception if the recursive definition requires the value fo the recursive definition. sml simply exposes what is true, and puts it on you to manage that, rather than pretending it's otherwise and hiding the specifics from you. i much prefer the "honest" route.
bob
(c) Robert Harper. All rights reserved.
On Wed, Jan 2, 2019 at 4:55 PM Robert Harper rwh@andrew.cmu.edu wrote:
ocaml screws this all up, but it has an implicit null value. it doesn't make sense in a by-value language to admit general recursion, but they try.
(c) Robert Harper. All rights reserved.
On Wed, Jan 2, 2019 at 4:51 PM eduardoleon notifications@github.com wrote:
I'm not seeing why it has to be this way. OCaml is by no means a lazy language, yet it allows
type 'a state = Cell of 'a cell | Value of 'a and 'a cell = 'a state ref
let rec r = ref (Cell r)
Of course, unfortunately it also allows
let rec xs = 1 :: xs
But I'm not seeing why we couldn't aim to allow the former while still forbidding the latter.
Adding a “null” value to the state type is inconvenient. One (crude, but mostly automatic, hence very useful) way in which I check that my code covers all possible cases is to
- Ensure all matches are exhaustive.
- Ensure all control flow paths are reachable.
- Ensure no exceptions are raised outside of code that does I/O.
These things fly out of the window if we add spurious cases to datatypes.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SMLFamily/Successor-ML/issues/13#issuecomment-450996485, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdsdf73uome2AfH3EHEGObX79U89dnvks5u_Sn1gaJpZM4H_WGo .
you can get a run-time error amounting a match exception if the recursive definition requires the value fo the recursive definition.
That's true for lazy values in OCaml (and for recursive modules, but that's a different story).
However, for OCaml definitions that don't involve laziness the safety of recursive definitions is statically checked by what amounts to a type system, and definitions that pass the check never raise exceptions at runtime.
I would like to add to @yallop's comment that this section of the OCaml manual explains how a recursively defined variable may appear in the right-hand side of its own definition. I found this via this Stack Overflow question, by the way.
Thanks! I'm not up to date, thanks for the summary. I wonder how strict the restrictions are? And, if backpatching is used as I think it must, then why introduce this overhead just for the sake of admitting more general recursion? It would seem to me that in a cbv language you want "recursive things" such as recursive functions or recursive suspensions or recursive tuples-of-functions (sometimes called objects).
Bob
(c) Robert Harper. All rights reserved.
On Fri, Jan 4, 2019 at 6:12 PM yallop notifications@github.com wrote:
you can get a run-time error amounting a match exception if the recursive definition requires the value fo the recursive definition.
That's true for lazy values in OCaml (and for recursive modules, but that's a different story).
However, for OCaml definitions that don't involve laziness the safety of recursive definitions is statically checked by what amounts to a type system, and definitions that pass the check never raise exceptions at runtime.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SMLFamily/Successor-ML/issues/13#issuecomment-451596705, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdsddaRyqMM0GKv8WTlGNPeK4MZ7d9vks5u_9_QgaJpZM4H_WGo .
You're welcome! OCaml does indeed use a kind of backpatching, but it's a bit more efficient than the naive approach, which involves an extra indirection whenever a recursively-defined value is accessed. The basic idea is to allocate a dummy block for each recursively-defined value, then overwrite the blocks directly during initialization. (The details are in "Compilation of extended recursion in call-by-value functional languages by Hirschowitz, Leroy & Wells.)
There are two restrictions:
let rec x = (e1, e2)
is acceptable because it allocates a block of statically-known size 2, but let rec x = if e then e1 else e2
is not, because the size of the value depends on e
let rec x = ref (C x)
is ok, but let rec x = ref (C !x)
is not.The second restriction is the one that's based on a type-system-like check, described in a draft paper. (The manual @eduardoleon linked is a bit out of date, unfortunately.)
The restrictions seem flexible enough to be useful in practice -- for example they allow @eduardoleon's example above, and some other real-world examples involving mutual recursion between functions and records.
My understanding is that, in OCaml, the restrictions are meant to be just strong enough that recursive definitions won't compromise type safety in the progress-and-preservation sense. From a performance standpoint, the overhead of let rec
is only present if you explicitly request it, and let
has no such overhead, which I think is a reasonable state of affairs.
In a call-by-value language, we expect positive types, i.e. sums and eager tuples, to satisfy suitable induction principles. But, no matter what artificial restrictions we impose on the construction of reference cells (the only natural one is “don't dereference it until it has already been constructed”), we are not going to recover any similar induction principle for them. So we might as well allow self-referential cell construction, which causes no harm.
@yallop Thanks for the reference to the paper.
I like the improvement to the rec syntax, which was wildly out of control in Standard ML. However, let me lay down a marker saying that it may have to be revised again depending on what we do about my suggestions regarding "val", "def", "need", and "fun" declarations. Where "rec" is permitted should be tied to the semantics of the thing. Functions can be recursive, need cells can be recursive, but can general values?