Open lpw25 opened 5 years ago
Why not put them directly in the corresponding module, instead of inside a submodule?
As I see it, a submodule is useful when the module in question is typically opened to avoid bringing these operators into the global namespace, but my impression is that one won't typically open these modules.
As I see it, a submodule is useful when the module in question is typically opened to avoid bringing these operators into the global namespace, but my impression is that one won't typically open these modules.
Submodules are also useful for the reverse situation -- where the submodule is always opened but you don't want to bring in everything in the parent. That is the case here. I literally never want to open
List but I need to use open
to use these operators.
Good point, fair enough.
@lpw25 In that case, you could also include the Syntax
module in the main one.
Module that contains operators are called Infix
in batteries/containers, but it seems we're going to need a new convention...
I'm open to other suggestions for the name
Syntax
.
I think it would be nice to have a name that can be used in general for bringing domain specific operators like specialized comparisons, arithmetic ops, monadic operators and other infix operators into the toplevel scope. I have the impression that you don't want to distinguish between these usages in practice (it would become a bit too bureaucratic) so having a single good name for all of these would be nice.
I'm not saying Syntax
is necessarily a bad name but OTOH no new syntax is being defined. I would lean towards Ops
which is short for the M.()
notation and may prevent a few opens. Other than that with tongue in cheek I'd propose Open
-- the module you are allowed to open.
I think it would be nice to have a name that can be used in general for bringing domain specific operators like specialized comparisons, arithmetic ops, monadic operators and other infix operators into the toplevel scope.
I'm not so sure about this. The let
operators and any similar operators like match
are of a very specific form. By opening one of these modules you are essentially choosing the kind of underlying computation for this piece of code. I think I prefer to have a name that means providing this specific kind of operation rather than a general provision of operators.
To put this another way, if someone wants to use some let
operators for something other than monadic/applicative operations -- nothing immediately springs to mind but I'm sure there are plenty of things people might try -- then I would strongly prefer them not to put them in a Syntax
module, since in their case let open Foo.Syntax in
has a different underlying meaning.
I'm not sure its completely related, but I also like the parallel between this Bar.Syntax ( expr )
form and the computation expression form async { expr }
used in F#. It feels like if you just put any old operators in Syntax
then you are losing that parallel in some way.
All this is not to say that I'm against having a standardised name for modules containing all the operators associated with a type (e.g. Ops
or Infix
), I would just like to separately have a module containing only the "syntactic operators".
To put this another way, if someone wants to use some
let
operators for something other than monadic/applicative operations -- nothing immediately springs to mind but I'm sure there are plenty of things people might try -- then I would strongly prefer them not to put them in aSyntax
module, since in their caselet open Foo.Syntax in
has a different underlying meaning.
I can see your point. However with the current proposal you are attributing a (monadic) meaning to the name Syntax
which is a bit odd.
However with the current proposal you are attributing a (monadic) meaning to the name Syntax which is a bit odd.
True. Maybe a name based on "computation" is better as that is the most commonly used word to cover both monadic and applicative.
True. Maybe a name based on "computation" is better as that is the most commonly used word to cover both monadic and applicative.
In order to nurture a bit of confusion I propose either Seq
or Haskell
.
I like Syntax
as a convention. It's the module that rebinds the rebindable syntax, after all.
An alternative is to have sub-modules named after the meaning of the exported syntax, like List.Monad
and Cmdliner.Applicative
. But this strikes me as mouthful and unnecessary. Plus, we have an entire family of rebindable keywords now, and people will use them in various ways, so we end up with M1.Applicative.(let*)
and M2.Monad.(and!)
.
To me, Syntax
says "I'm changing the meaning of your syntax, go read the docs". And I like that.
To me,
Syntax
says "I'm changing the meaning of your syntax, go read the docs". And I like that.
But then why not put other operators in such a module ? Something @lpw25 finds undesirable (not sure I'm totally convinced by his point though).
Even if this adds some more names for existing functions, I think it would be good to expose return
and bind
(and perhaps product
) in the parent modules. This is to support writing in explicitly monadic style, but without forcing to use the new let-syntax or at least the need to use open
(e.g. let (let*) = List.bind in let* x = ... in ... .. List.return y
).
Only exploring the design space: assuming we have directly List.bind
and List.return
, would it make sense to tell users to open Monad(List)
now that this is possible? (Monad
would just be a functor renaming some identifiers). Performance would not be great without a good inliner, though.
I don't feel fully comfortable with providing features that more or less force people to use open
. The next PR that will add any identifier to one of those *.Syntax
modules could easily break code because of shadowing. For instance, imagine this PR only proposed the let*
form and not let+
; people could start using let+
locally for their own purpose, including in contexts that also use the let*
from List
, and this would break if let+
were added later to List
. This would suggest creating smaller sub-modules, which should really not be extended later.
For instance, imagine this PR only proposed the let form and not let+; people could start using let+ locally for their own purpose, including in contexts that also use the let from List, and this would break if let+ were added later to List.
Seems like these people would just be asking for trouble. Also, it would be very easy for them to fix their code in such a case.
Even if this adds some more names for existing functions, I think it would be good to expose return and bind (and perhaps product) in the parent modules. This is to support writing in explicitly monadic style, but without forcing to use the new let-syntax or at least the need to use open (e.g. let (let) = List.bind in let x = ... in ... .. List.return y).
I strongly agree with this. Being able to quickly and explicitly bring one or two of these let/match/etc operators into scope would be very helpful for small scale readability. I've found that approach somewhat more readable for new and experienced OCamlers users with the existing monadic syntax ppxs.
Seems like these people would just be asking for trouble.
I don't see why. Are you saying that the code will never mix different custom let binders? What about other identifiers? There is already return
; could more identifiers be added later? If so, I suppose they will also have rather generic names, likely to clash with user identifiers.
Also, it would be very easy for them to fix their code in such a case.
I find this is a rather weak argument. Things are changing a bit, but we still consider maintaining backward compatibility of the core distribution an important feature, especially when breaking changes cannot be announced in advance with deprecation markers. Even if fixing the code base is easy enough, this requires intervention from the package author and from OPAM package maintainers (patching the code, creating a new release, making sure the previous one is marked as incompatible with the new OCaml, etc). If that libraries happened to be used by other OPAM packages, we can end up with breaking a good part of the eco-system for some time simply because we add more identifiers.
Are you saying that the code will never mix different custom let binders?
No, I'm just saying that I think people should explicitly bind their binders when they do that. But that's like, my opinion.
As for the "weak argument", it doesn't seem like much of an argument indeed. It just reflects the fact that I don't mind breaking code that shouldn't have been written in the first place.
But thinking more about it, I'm not sure that discussed style is as bad as I originally thought, and clearly: I have no idea what style people will adopt. So please, disregard my intervention in this discussion :)
Could I suggest not getting too bogged down in discussions on what we would do in the hypothetical situation that we wish to extend the signature of these modules. Yes we should probably be more careful in that situation than we would with other modules, but no its not a disaster if we absolutely had to add more operators since in practice it would probably break approximately no code and would give a nice warning about shadowing for cases where it did. The only cases where I could see us adding more values to these modules is for things like a future (match*)
operator, in which case we obviously don't shadow anything in existing code.
Would people prefer it if I moved the return
functions out of the Syntax
modules? That way it would only be operators inside of them, and we would only be encouraging people to use open
where it was strictly necessary.
I think it would be good to expose return and bind (and perhaps product) in the parent modules.
Do you mean e.g. having Seq.bind
even though Seq.flat_map
exists? I'm happy to do so, but I'd like there to be some consensus (or an executive decision) on how we want these things handled before I make the change.
Only exploring the design space: assuming we have directly List.bind and List.return, would it make sense to tell users to open Monad(List) now that this is possible?
I'm not completely against that approach, but there are some issues:
Monad.Syntax(List).( expr )
as that syntax is not currently supported (and may be ambiguous). You would have to use let open Monad.Syntax(List) in ...
instead, which is not as nice.It would also probably require having both of:
Monad.Syntax:
functor (X : sig type 'a t val bind : ... val product : ... val map : .... end) ->
sig val ( let* ) : ... val ( and* ) : ... val ( let+ ) : ... val ( and+ ) : ... end
Applicative.Syntax:
functor (X : sig type 'a t val product : ... val map : .... end) ->
sig val ( let+ ) : ... val ( and+ ) : ... end
Would people prefer it if I moved the
return
functions out of theSyntax
modules?
I think it wouldn't do any harm (it can be added later if it is perceived as missing). I think I would certainly write Ok v
rather than return v
if I'd use the Result
monad. It also give a better local hint which let*
are being used assuming someone open
ed the syntax at the toplevel.
The submodule discussion is very fun but I think there are other things to discuss as well, among which:
If a module has several distinct let-operator structures of interest¹, what's in the Syntax module? All of them? This has already been asked in https://github.com/ocaml/ocaml/pull/2170#issuecomment-442451338 and https://github.com/ocaml/ocaml/pull/2170#issuecomment-442453798.
Is there an implicit decision that *
is for monadic operators, while +
(or something else?) should be used for Applicative structure? Or is the the idea that +
be used with map
operators, so for the Functor structure? Is there consensus for that choice?
(To me having +
for map and *
for bind makes some sense, as bind is in some sense a "stronger" version of mapping with a value-dependent result structure/shape/effect.)
¹: I don't know offhand of modules with several distinct Monad structures of interest (I'm sure they exist, and probably examples can be built by lifting the case of numbers with several distinct Monoid structure), but all sequential collections (List, Seq, etc.) have a zipWith-style applicative structure that is distinct from their cartesian-product-esque monad.
My personal feeling is that we are moving a little too fast in solidifying conventions for rebindable-syntax lets. For extensions and attributes, @alainfrisch did a lot of work on case studies, presenting and discussing several use-cases for the feature, and evaluating each of the proposed syntaxes/designs on them. Where are those examples with the proposed operators here? In the #1947 discussion there were some experience reports from ppx_let
, but the syntax used there is not exactly the same, and some of the questions need to be handled differently.
This is not directly related to #2170 itself, but personally I would like to see idiomatic real-world code that uses those new binding operators (we should also agree on a name for the feature, I would suggest "binding operator" rather than "let operator" given that and
is also involved) with, let's say, list
and option
as a monad, and zipList
and cmdLiner (or a parser, etc.) as applicatives.
In particular, I have the vague intuition that something is wrong with having (and* )
and (and+)
defined to be the same thing in all cases. Shouldn't (and+)
be more like List.combine
than a cartesian product? I can't tell if this is a real issue or just sleep-deprivation speaking; if we had robust examples exercising these features, and not just abstract code with rewriting, we could easily tell for sure.
If a module has several distinct let-operator structures of interest¹, what's in the Syntax module? All of them?
I would expect individual Syntax
modules for individual notions of computation. So for example, I would expect Seq.Syntax
to be the normal monadic operations for Seq.t
, with a separate Seq.Zip.Syntax
for the zipping applicative form if people wanted to expose that.
Is there an implicit decision that * is for monadic operators, while + (or something else?) should be used for Applicative structure? Or is the the idea that + be used with map operators, so for the Functor structure? Is there consensus for that choice?
I've definitely been playing a bit fast and loose with the subtle difference between these two approaches -- refering to let*
as monadic and let+
as applicative -- but really what I've implemented here is distinguishing between *
for binding and +
for mapping. So it would indeed be reasonable for a functor that was not an applicative to implement let+
but not and+
.
This is also why I've put in both and+
and and*
one is for mapping over a product and the other is for binding to a product. Of course, how you make a product doesn't depend on whether you are going to map over it or bind to it, so they are necessarily the same operation. I also think it looks slightly nicer to always use the same operator symbol for the let
and the and
.
If we instead said that *
was for monadic and +
was for applicative then I would not expect to have an and*
operator. I would also not expect to see a functor implement let+
but not and+
.
we should also agree on a name for the feature, I would suggest "binding operator" rather than "let operator" given that and is also involved
Sounds good.
In particular, I have the vague intuition that something is wrong with having (and* ) and (and+) defined to be the same thing in all cases. Shouldn't (and+) be more like List.combine than a cartesian product?
No. List.combine
is the product for a ZipList style applicative. That applicative has no associated monad and also requires having a return
that creates an infinite list instead of a singleton. We could expose it as List.Zip.Syntax
, but it is definitely a different thing. (I also personally wouldn't expose it because I think the use of infinite lists in OCaml should not be encouraged -- Seq.t
is a much more natural type to use for this kind of applicative).
My personal feeling is that we are moving a little too fast in solidifying conventions for rebindable-syntax lets. For extensions and attributes, @alainfrisch did a lot of work on case studies, presenting and discussing several use-cases for the feature, and evaluating each of the proposed syntaxes/designs on them. Where are those examples with the proposed operators here? In the #1947 discussion there were some experience reports from ppx_let, but the syntax used there is not exactly the same, and some of the questions need to be handled differently.
What exactly are you hoping to learn from these examples that you wouldn't learn by looking at existing code using ppx_let
? Doing ad-hoc case studies isn't very helpful if you don't have some particular aim in mind or some hypothesis to test. I'm reluctant to spend my spare time manually converting other people's code to use this syntax without some clear objective in mind.
I'm unsure about moving more slowly on conventions. On the one hand, if we wait then perhaps some useful conventions will arise organically from the community and we can adopt them in the stdlib. On the other hand, we might wait and then have every different library using different conventions and find it is too late to try and get some uniformity via the stdlib.
Note that I already put some of the expected conventions into the manual, so if we do intend to wait and see then we might need to make some changes there as well.
I've definitely been playing a bit fast and loose with the subtle difference between these two approaches -- refering to let as monadic and let+ as applicative -- but really what I've implemented here is distinguishing between for binding and + for mapping. So it would indeed be reasonable for a functor that was not an applicative to implement let+ but not and+.
You are currently in the process of shaping the look and feel of monadic code (become more and more common) in OCaml for the years to come. I would very much like this look and feel to be explained precisely with appropriate examples and design guidelines. People new to monads should have some chance of following but, maybe even more importantly, people used to monadic or applicative code in their own extensions, or Haskell, or Coq/Agda/Idris, should be able to read whatever documentation and conventions we have and infer how to format their own knowledge/practices within those conventions.
In that respect, I think that being unclear about Functor vs. Applicative is not good, and that the various design choices you are pushing into the standard library should be clearly presented, discussed, and documented.
This is also why I've put in both
and+
andand*
one is for mapping over a product and the other is for binding to a product.
This discussion style is too abstract. Could you provide natural examples that look right with the current definition of and+
? Show them with and*
instead? Are there examples that would look nicer if and+
was the zipping one, instead of an alias for and*
? (How do I express the zipping product if and+
is cartesian?)
(I also personally wouldn't expose it because I think the use of infinite lists in OCaml should not be encouraged -- Seq.t is a much more natural type to use for this kind of applicative).
Zipping is useful as a modular way to implement mapN
operators, and as such it seems fairly common in lists. I've been working recently in parts of the codebase where List.map2 foo (List.combine bar baz)
is a common sight.
(We can build a 1-cyclic list, but I think we could also consider defining zipping so that zipping a one-element list with another length results in implicit extension rather than a failure.)
What exactly are you hoping to learn from these examples that you wouldn't learn by looking at existing code using
ppx_let
? Doing ad-hoc case studies isn't very helpful if you don't have some particular aim in mind or some hypothesis to test. I'm reluctant to spend my spare time manually converting other people's code to use this syntax without some clear objective in mind.
For one, I haven't seen any code using ppx_let. The only natural example I remember seeing is one of cmdliner usage with a custom let%map
extension, provided by @diml. Providing URLs to idiomatic ppx_let would be a useful first step.
As a user reading documentation on this feature, here are some questions that I would like to ask:
let*
and let+
are both available, why should I use one instead of the other? Can I see a let*
-using example that can be profitably rewritten with let+
or vice-versa?and*
(and/or and+
) rather than just a sequence of lets? Is it always useful, or only for some monads? Are the two always equivalent?Are there examples that would look nicer if and+ was the zipping one, instead of an alias for and*? (How do I express the zipping product if and+ is cartesian?)
The zipping operations are not part of the usual list monad. The relationship between a monad and its applicative is often given as (<*>) = ap
in Haskell land. This is equivalent to:
let+ f = f
and+ x = x in
f x
being equal to:
let* f = f in
let* x = x in
return (f x)
and clearly does not hold if you make and+
be the zipping operations.
Zipping is useful as a modular way to implement mapN operators, and as such it seems fairly common in lists.
Sure, but that doesn't make zipping a structural part of the list monad. Most useful operations on lists are not the core monadic operations.
(We can build a 1-cyclic list, but I think we could also consider defining zipping so that zipping a one-element list with another length results in implicit extension rather than a failure.)
(That seems pretty dubious to me, why would a list of length 1 be considered infinite but a list of length 2 be considered finite. But either way we're getting off topic here.)
Providing URLs to idiomatic ppx_let would be a useful first step.
As a user reading documentation on this feature, here are some questions that I would like to ask: [...] I'll come back to these when I have more time
For the submodule name, I suggest Option.Keywords
. It's the most immediately self-explanatory and precise name I've come up with so far.
My preference is to undo the let*
and let+
syntax, and to prefer specific decorators on each let
or other keyword instead, like proposed in https://github.com/facebook/reason/pull/2140. This would do away with the issue of opening modules to get operators in scope. So, some variation of:
let.Option a = b in
...
etc. We can probably make this prettier somehow.
We could have Option
select implementations from module Option
, or implicitly from a submodule Option.Keywords
, where the .Keywords
path component doesn't have to be written by the user.
IMO this is far more "clear" than let*
, especially if the implementation of let
in Option.Keywords
has to be called Option.Keywords.let_
. By contrast, the connection between let*
and opens requires non-trivial knowledge, as well as more analysis of the code being written.
@keleshev made a similar suggestion in https://github.com/ocaml/ocaml/pull/1947#issuecomment-409332752, from which we came up with decorating by a module (whether a good idea or not).
There are several additional reasons not to rely on open
s, most of which are out of my cache. One is that code with open
is more fragile to refactoring and copying/pasting.
Hi @aantron,
I hear you, but I think that everyone is wary of rolling back on the sort of consensus on the binding operators that we have managed to gather. Having some sort of first-class support for monadic notations has been discussed for at least 8 years now, and there was never enough support for one of these proposals to make it. This one passed barely, and as you can see in this discussion @lpw25 is still having a rough time. You should realize that rolling back on this one now will not make let.M
available at the same time (it doesn't have enough support for this), and is most likely to result in not having any syntactic support in the next release.
Personally I don't strongly believe that let.M
is an improvement over let-op
, so I don't feel that it is worth the risk; I would prefer to move forward with what we have than have nothing at all.
Why doesn't let.M
have more support? A few ideas:
As far as I am aware, the Reason folks never tried to communicate with us (the OCaml upstream) about their language design choices. If they want to coordinate language evolution (or compiler hacking, for that matter), they have to try a little harder than having someone randomly linking a PR once in a while. (I've said exactly that at the last ML workshop and maybe some communication will happen.) In the current state of affairs "Reason is considering doing X" doesn't give X much additional leverage.
let.M
has no clear connection between the syntax and the operator that is used under the hood. (In contrast, @keleshev's let.foo
proposal does have such a connection; they are quite distinct proposals and should be discussed separately.) This connection explains why some people have a marked preference for let-op
and and-op
, which is an extension of, say, .%()
and .%()<-
as user-defined operators.
One nice future-looking aspect of identifier-centric syntaxes (let*
and let.foo
) is that we can see how they would remain relevant in a world where OCaml has type-classes or a similar type-directed overloading mechanism for identifiers.
I hear you, but I think that everyone is wary of rolling back on the sort of consensus on the binding operators that we have managed to gather. Having some sort of first-class support for monadic notations has been discussed for at least 8 years now, and there was never enough support for one of these proposals to make it.
I would prefer to move forward with what we have than have nothing at all.
I find this line of argument unconvincing. Just because something's been wanted for a while doesn't mean that technical concerns should be pushed aside.
We've seen this situation before, but the case that really sticks in my mind is the proposal for non-recursive type declarations of the form:
type nonrec t = t
As soon as the nonrec
proposal was made, @alainfrisch pointed out that it didn't solve the general case:
it would also be useful to refer to a type defined above (especially in an outer module) in the module even if is has been shadowed [...] I guess it's worth thinking about a more general solution.
A little later @garrigue pointed out that the underlying issue was really about local names, not directly about recursion:
This said, I still have a strange feeling about this "nonrec" discussion, because this all boils down to questions of internal vs. external names, and they would be the only way to give a real solution to this problem.
I proposed an alternative syntax:
How about introducing a different binding symbol for simple non-recursive aliases?
type t module M = struct type t := t
However, even though nonrec
was evidently not the "right thing", it was pushed in anyway. Now, as of last month, we have a much better solution that is harmonious with the rest of the language, solves the general case @alainfrisch described, uses my proposed syntax, and is based on @garrigue's distinction between internal and external names. Unfortunately, we're still stuck with nonrec
as well, perhaps for ever.
So if @lpw25's proposal is the right design then it should be defended on technical grounds, not because the alternative is having to wait a little longer.
[Edit: this originally said that nonrec
was "rushed-in". I meant that it would have been better, in my view, to wait for a more complete solution to the problem it solved, but "rushed" is a bit misleading, since there was almost two years between the time nonrec
was proposed and the time it was merged.]
@yallop: fair points, but I would make a distinction between technical design discussions, such as the ones you are reporting in your post, and some syntax choices that are much more subjective/unprincipled (it's a beautiful and rare thing when concrete syntax has good justifications, my favorite example being match .. with exception
.) I'm questioning the design of and*
and and+
in this thread, but I haven't seen much objective discussion of various concrete syntax proposals, it's more of a matter of taste -- also important!
@gasche,
I'm not making any of these arguments:
I just think the let* syntax is not a good choice, due to reasons including:
I don't think these points are subjective, as they are directly related to the (predictable) structure of the thinking processes I (and others) have to undergo as readers and editors of code, also as learners of the language.
I know the module solution has its own drawbacks. However, I would strongly prefer not to have the let* solution, most importantly due to those of its drawbacks that I listed above. So, I agree with @yallop that this (ideally) requires some more design work and insight before it can be considered ready.
Somewhat less "meta," the module annotation is indeed less clear than @keleshev's proposal. However, it is arguably more clear than let*
:
let_
and and_
.I see the concern of how this would benefit from modular implicits or other overloading schemes. I agree that the module solution is weak here. However, this is by design. I pretty strongly believe that we should allow explicit syntax even in an OCaml that has overload resolution. The let* solution forces all monadic code using syntactic sugar to rely on opens or future overload resolution.
Perhaps this means that we need to prepare both an explicit and an implicit syntax. In current OCaml, however, it is (IMO) an explicit syntax which is more valuable. An implicit syntax only becomes highly valuable with overload resolution, and right now we are using open
effectively as a crutch in the absence of that.
The module proposal (and @keleshev's proposal) are in the explicit direction. I am not claiming that either of these is the right point in the design space.
The difficulty around and
can be explained as follows. For any binding construct there are essentially two forms: a bind
form and a map
form. The first allows the body of the construct to contain effects that depend on the bound value, whist the second does not. This extends beyond let
to any other binding constructs that we might add operators for in the future (match
, if
, for
, etc.). Each of these has two useful forms, and the syntax needs to allow you to choose the one you want. In the case of functors/applicatives only the map
form is available, but even for monads the map
form can have much better performance than using the bind
form with return
.
So in this proposal I'm using *
to indicate the bind
form and +
to indicate the map
form. However, for and
there is no such distinction. It is not actually a binding construct: it is a combinator that is applied before the value is bound. This means the same combinator is used with both map
and bind
, so we do not need to mark and
with +
or *
to distinguish the two cases. However, we do need to name the and
operator something. Which basically leaves us with three choices:
+
and *
to use with and
and
+
and *
names for and
so that users can just use the and
that matches their binding construct without thinking about it.In this PR I've implemented the third option.
Note that this issue is not unique to and
. If we add more of these operators for other constructs in the language we are likely to need some other combinators as well. For example, I would expect to need an exception
combinator if we wish to allow using exception patterns with the match*
operators proposed in #1955.
As to suggestions of other syntax. Most of the options are inferior to the current implementation, as discussed in the issue that implemented it. Fundamentally, we need to write letfoo
for some foo
and then relate that to some function bar
that implements the operation. Some desirable properties are:
foo
whether we want the binding or mapping functionfoo
and bar
should be derivable from the syntaxletfoo
and matchfoo
might require different bar
s.letfoo
for all different computation types.These properties heavily favour having bar
being equal to letfoo
and using the ordinary scoping rules for obtaining the bar
when given letfoo
. At that point we are simply left with the choice of what to allow for foo
. This is essentially a matter of taster. The previous PR went with operator symbols. Another reasonable option would be allowing /ident
.
I prefer the operators because the point of monadic syntax is allow users to treat a particular monad as the ambient monad of their computation. The distinction between binding, mapping and non-monadic let
s should be clear but it shouldn't be over-emphasized. I suspect that much of some people's preference for using longer identifiers is an instance of Stroustrup's rule. I also use refer to the experience of computation expressions in F#, where they use operator symbols and where people seem pretty happy with the readability of resulting code.
@aantron Why do you think that it harder given:
Foo.Syntax.(
let* x = a in
b
)
to relate the let*
to Foo.Syntax.( let* )
than given
let.Foo x = a in
b
to relate the let.Foo
to Foo.bind
(or is it Foo.map
)?
In the first we rely only on the standard lexical rules of the language, and the convention that we keep syntactic operators in modules called Syntax
. Whereas in the second you rely on some ad-hoc rules about which operator maps to which function.
Mainly because the Foo.Syntax
can be quite far-removed from the let*
that it affects, in particular when in some "realistic" code a
is actually a complex expression that spans many lines. Arguably, it might be better to avoid such a replacement of a
, but I would prefer not to weakly impose that on users due to what could be called an insufficiency of the surrounding syntax in the opinion of some :)
So it's not a matter of just the convention of which keyword maps to which function. This comparison between the approaches is not on one axis:
let
to function let_
, maybe .Syntax.let_
.let*
.To maintain clarity about my position, I don't believe the module is the ultimate solution here. I do believe it's important to address (2) in some way if reasonably possible.
The verbosity constraints for that feature is pretty tight. If the let operator is too verbose (and the module qualification is very verbose), it immediately becomes less interesting than just using the bind operator manually. When we migrated the lwt syntax extension from camlp4 (lwt x = ..
) to ppx (let%lwt x = ...
), several "advanced" users complained about the increase in verbosity and switched back to using >>=
.
While I also dislike the slightly obtuse aspect of using operators, let operators from different monads are very rarely mixed in the same context, and a given let operator is used repeatedly in the same scope. That's why most do-like notations make the user pick a monad for a whole scope, not for an individual lets. In that context, having to specify the complete module name for each let binding in a scope is fairly punishing.
Mainly because the
Foo.Syntax
can be quite far-removed from thelet*
that it affects, in particular when in some "realistic" codea
is actually a complex expression that spans many lines.
That is certainly true note however that the actual expressions bound by the keywords and the expression body will certainly provide strong hints as to which syntax is being used (which is also the reason why I would prefer if return
was removed from the proposal).
I would also add that as code-comprehension tools become more common, our editors are now able to give us either the type or the definition of a binding operator, which resolves any potential ambiguity.
Is the M.let* x = foo in bar
syntax available? (For indexing operators we have foo.M.%(bar)
.). If it were, we could easily propose an ocamlformat mode that adds explicit quantification to each binding or indexing operator for code-reading purposes.
to relate the let.Foo to Foo.bind (or is it Foo.map)?
Sorry, I did not follow all discussions around this proposal, but has it been discussed to relate let.Foo
to "identifier" (let)
in module Foo
(assuming the parser interprets (let)
as a valid lident, this would be written Foo.(let)
).
@alainfrisch how do we apply (let)
then, let. x = t in u
?
I've been thinking more about and+
vs. and*
. Remark that the following code:
let+ x = a in
let+ y = b in
t
has type 'a t t
:
# let _let_map li f = List.map f li;;
val _let_map : 'a list -> ('a -> 'b) -> 'b list = <fun>
# _let_map [1; 2] @@ fun x ->
_let_map [3; 4] @@ fun y ->
(x, y);;
- : (int * int) list list = [[(1, 3); (1, 4)]; [(2, 3); (2, 4)]]
If we view the use of an and
-form instead of the second let+
as collapsing this to an 'a t
, then this is exactly a monadic (not functorial or applicative) operation. In particular, I don't expect an equivalence between nested-let+
and and+
to hold in general: for some applicative functors that can define let+
and and+
but are not monads, it will not hold.
Personally I think that and*
is a more natural name for such a layer-collapsing cartesian-product operation (it is a "product"). This would free the usage of and+
for a parallel-iteration operation, in the spirit of zipping ("+" for parallelism is also a natural symbolic choice).
I think that the two operators (cartesian vs. parallel product) only differ for "multi-shot effects" like List, not for linear or affine effects like Option or State or Lwt.
how do we apply (let) then, let. x = t in u?
@gasche Not sure if this was a rhetorical question, but let._ x = t in u
could work in that case, picking the in-scope (let)
.
Still thinking about and*
and and+
, I think there are three design choices.
Not have a user-definable syntax for and
-operators to remove that difference. As Leo points out, this is syntactically awkward (it's doable if let*
is a record, but this proposal was rejected as not-the-standard-interface.)
Keep them both the same thing when let+
and let*
refer to different operators (map and bind) over the same monad.
Use and*
for a cartesian product, and and+
for a parallel product.
For single-shot or at-most-one-shot effects (Option, Lwt, CmdLiner...), I think that there is no difference between (2) and (3). The main difference comes from uses of let+ ... and+
to describe comprehension on sequences -- List, Seq, Array.
Personally I would find (3) more natural, and it is also more expressive. It has the following drawbacks:
let .. and
block.To illustrate the second problem, consider
let+ x = inputs
and+ y = outputs
and* c = checks
in c x y
which maps over inputs
and outputs
in parallel, performing each check for each parallel (i, o)
pair, and
let+ x = inputs
and* f = transforms
and+ y = outputs
in (f x = y)
which iterates over all possible pairs (x, f)
of an input and a transformation function, computes the application f x
and compares each result to the corresponding value in outputs
. In both cases, the result list has length |inputs| * |checks/transforms|
, but in one case outputs
must have size |inputs|
and in the other it must have size |inputs| * |transforms|
.
Remark: @lpw25, if I understand correctly the let+ .. and+
syntax (even with parallel-and instead of cartesian-and) doesn't require defining a pure
operation: what is typically written pure f <*> a <*> b
becomes let+ x = a and y = b in f x y
, so the pure
is gone, and most use-cases of zipping never actually require to worry about infinite lists.
(cc @yallop and @stedolan)
Note: I tried to understand and+
better from the code examples given by @lpw25, but I think they are not enough for this. let%map .. and ..
is rarely used, and I have only found uses for single-shot effects like CmdLiner or Incremental.
Note: the reason why and+
raises questions and and*
does not is that product is the central operation for applicatives, not for monads where it is derived in "the canonical way that corresponds to nested let*
".
how do we apply (let) then, let. x = t in u?
One could write of course (let) t (fun x -> u)
(and then discuss about providing a shorter syntax for simple function literals..). I'm not sure it's worth providing a way to use the "binding syntax" for the current scope. The way to define a custom binder locally, if really needed, would be through modules:
let module M = struct let (let) e f = ... end in
let.M x = ... in ...
or if one wants to use locally in the current module and also export:
module M = struct let (let) ... end
include M
(I suspect that modular implicits would require a similar "approach".)
let.M
does maintain a one-to-one relationship between the "letfoo" and the "bar" from my discussion above. However, it will require you to write let.Option.Bind
and let.Option.Map
-- which is extremely verbose. If instead you use let.Bind
and let.Map
then you are just back to using open
and you'd be better off with operators or let/bind
-style identifiers.
Sorry for opening a new can of worms here, but I want to raise this question: does it make sense to make OCaml extra hospitable for monads, when we're looking forward to typed effects, which are mostly isomorphic to monads but cleaner?
Here is a property (maybe even a specification) of the parallel product, in term of commutation with the builtin product of the language:
parallel-prod
(map m (fun x -> t1))
(map m (fun x -> t2))
=
map m (fun x -> (t1, t2))
If (and+)
was the parallel product, this property would, in particular, guarantee the following equality between bindings
let+ x = m
and+ x = m
in t
=
let+ x = m
in t
Edit: the proposed property can be expressed as: (and+) m m = (let+ x = m in (x, x))
. In contrast, we have (and*) m m = (let* x = m in (let* y = m in (x, y)))
One option that I think wasn't mentioned that avoids the need to define both (and*)
and (and+)
is to define only (and)
, which would be used by both (let*)
and (let+)
. Unlike (or)
, (and)
is currently not a valid operator, so I think that would work. To call the operator directly the (and) x y
syntax could be used. The potentially confusing part is that x and y
wouldn't be valid.
One option that I think wasn't mentioned that avoids the need to define both (and*) and (and+) is to define only (and)
I think this is probably my preferred option. It's slightly less obvious in:
let* x =
...
and y =
...
in
...
that the binding of y
is special, but it is a better reflection of how and
relates to map
and bind
. It is also how ppx_let
looks and it seems to work pretty well there. I'd still like to leave in support for and
operators other than and
since I have a few weird use cases that require multiple distinct and
operators, but adding support for (and)
as well is easy to do.
Let me play the adversary and criticize the (and)
idea (which does have merits).
It is inconsistent with other user-defined operators, which are all distinct from builtin constructs and keywords: we did not allow rebinding (.())
or (.()<-)
or (let)
-- rebinding (let)
would be a bad idea as it would make it too hard for readers to know that something special is going on. (and)
is doable because it must be marked by a non-standard let
-class operator, but inconsistent with this global trend.
In a world where different and<op>
operators could be used in the same binding, we would have a use for vanilla and
, which is to signify a simple binding. Haskell's do-notation has monadic bindings of the form x <- e
and standard bindings of the form let x = e
; in set comprehensions, it is not uncommon to wish to alternate between x in foo
forms and x = bar
forms.
That said, I don't see how to extend the existing desugaring for let/and to accomodate vanilla and x = foo
bindings that should result in a simple let
. (If we know of a return
operator we can desugar into and* x = return foo
, but our syntax precisely does not assume a return exists.)
@gasche Perhaps it would help if I wrote out the various laws that the operators should obey. Apologies for the rubbish markdown formatting (edit from trefis: reformated).
let+
is equal to | ||
---|---|---|
Identity |
let+ x = v in
x
|
v
|
Composition |
let+ y =
let+ x = v in
g x
in
f y
|
let+ x = v in
f (g x)
|
and
is equal to | ||
---|---|---|
Naturality |
let+ x = u
and y = v in
f x, g y
|
let+ a =
let+ x = u in
f x
and b =
let+ y = v in
g x
in
a, b
|
Left identity |
let+ x = return a
and y = v in
y
|
v
|
Right identity |
let+ x = u
and y = return b in
x
|
u
|
Associativity |
let+ x = u
and (y, z) =
let+ y = v
and z = w in
y, z
in
f x y z
|
let+ x = u
and y = v
and z = w in
f x y z
|
let*
is equal to | ||
---|---|---|
Left identity |
let* x = return a in
f x
|
f a
|
Right identity |
let* x = m in
return x
|
m
|
Associativity |
let* y =
let* x = u in
f x
in
g y
|
let* x = u in
let* y = f x in
g y
|
Compatibility with `and` |
let+ x = u
and y = v in
f x y
|
let* x = u in
let* y = v in
return (f x y)
|
(and) is doable because it must be marked by a non-standard let-class operator, but inconsistent with this global trend.
True but I think this is basically fine.
Haskell's do-notation has monadic bindings of the form x <- e and standard bindings of the form let x = e; in set comprehensions, it is not uncommon to wish to alternate between x in foo forms and x = bar forms
They do, but they don't allow you to use the different forms in a single binding -- partly because they don't have support for binding multiple things at once. Whilst it is often useful to alternative between let*
and let
, I don't think it is ever particularly useful to have:
let* x = monad
and y = not_monad in
f x y
because you could always use:
let y = not_monad in
let* x = monad in
f x y
or
let* x = monad
and* y = return not_monad in
f x y
instead. I also think it is less confusing to have all the defining expressions for let*
or let+
expression be monadic.
This PR provides let operators (see #1947) for
List
,Option
,Result
andSeq
. These operators are placed in a sub-moduleSyntax
in each of these modules, which also includes a definition ofreturn
.This enables code like:
I'm open to other suggestions for the name
Syntax
. I came up with:Computation
,Comp
,Expression
,Expr
andSyntax
, from which I likedSyntax
the best. I would say that the name should not be based around the word "let" because I suspect that in the future we will want to put other operators in there (e.g.match*
from #1955).Some of the required functions for these operators did not previously exist. None of the modules had their monoidal product and
List
was missing its bind. Of these I've exposed the product forList
andOption
, and the bind forList
. I didn't expose the product forResult
orSeq
because I didn't feel people were likely to want to use these functions outside of maybe using them asand
operators.The documentation of the newly exposed functions is very minimal so improvements there are welcome.
Decision queued on: