Closed paulch42 closed 6 years ago
I don't mind brackets in function applies and definitions, though perhaps that's just a matter of the languages I'm used to.
The syntax of an "apply expression" is currently used for map lookups, sequence indexing and function application. Are you suggesting all of these should change, or that bracketless-applies would only work with function application?
Are you saying that "f(a)(b, c)" would change to "f a b c" and the type checker would have to work out the argument grouping? Currently, this parses as two applies, but it looks like you're saying it would parse as three applies now? Does that mean a function can only really have one argument? (I do see how that relates to our earlier discussions about VDM's parameter types.)
The syntax of an "apply expression" is currently used for map lookups, sequence indexing and function application.
and operation call statements look very similar to specifiers.
We should be very careful about semantic impacts of this RM. Will the type system still be sound if we add D[T1 ... Tn -> T] = D[T1 -> ... -> Tn -> T] ? I'm not sure.
For someone coming from a less syntax heavy functional language like Haskell, the parentheses look clumsy. At least, they do to me. I thought about adding maps and indexing to the RM as an optional component, but in the end decided against it as they don't cause me concern. But perhaps this is a topic for discussion.
If you typically use multi-argument functions then this is not much of a change. The real benefit appears with higher order functions, which is why I say in the RM it could encourage greater use of higher order functions in VDM.
No f(a)(b,c)
would be f a (b,c)
. The type checker does not have to work out argument grouping, that is already handled by the parser. Note in the proposed change to the syntax rules the expression and pattern lists must still be enclosed in parentheses at all times.
To be clear, there is no intent to somehow identify multi-argument functions and higher order functions. The types T1 * ... * Tn -> T
and T1 -> ... -> Tn -> T
are distinct, and remain distinct. Functions of those types are defined differently from a syntactic perspective. There is no change to semantics. It is just a syntactic simplification where precedence rules allow it.
Any specific instance of this, on its own, is of little benefit. This is one of these situations where the benefit arises from repeated use providing an overall gain of reducing the syntactic baggage.
To be clear, there is no intent to somehow identify multi-argument functions and higher order functions. The types T1 * ... * Tn -> T and T1 -> ... -> Tn -> T are distinct, and remain distinct.
Okay, I understand.
One tricky example is f g(x) y
, which could be parsed as either f (g(x)) y
or f g x y
, depending on which rule would have more priority, expression, '(', [expression list], ')'
or expression, expression
.
I think your example should be parsed f g x y
. To me that is the consistent interpretation of the left associativity of function application. Similarly, f g(x,y) z
would be parsed as ((f g) (x,y)) z
.
Yes, f g x y
is a valid left associative parse of the whitespace operator for function application. However, g(x)
parsed by the rule expression, '(', [expression list], ')'
is another form of function application, which is not the left associative whitespace operator.
I think we need to define precedence level for applicators in Section C.2 to eliminate ambiguity.
If you want f g(x) y
parsed as f g x y
, function applications by whitespaces should have higher precedence than function applications using parentheses.
According to my understanding of the LRM all the information is already there.
By C.2 all applicators have the same precedence. By C.7 all applicators are left associative. Does that not mean f g(x) y
is parsed as ((f g)(x,y)) z
?
Nick alludes to a brief discussion I had with him related to this topic since there is something about function application in VDM that I find rather strange. I think I will repeat the point on core to get some discussion going. I didn't put it in the RM because it involves rather more fundamental aspects of VDM that would complicate the issue.
Paul Chisholm
The syntax of function application.
VDM requires that every argument to a function, when applied, is wrapped in parentheses. While this is essential for functions of multiple arguments, for functions of a single argument we generally must include parentheses that contribute nothing of real value, only serving to complicate the syntax. Modern functional languages have addressed this situation by not requiring that function arguments be enclosed in parentheses (subject to the rules of syntax precedence).
For example, a function of three arguments would be invoked with f(a,b,c)
as a multiple argument function, while if it was defined as a higher order function it would be invoked by f(a)(b)(c)
, and if as a higher order function but parentheses were not mandated, it could be invoked by f a b c
.
Higher order functions in VDM can employ single arguments and multiple arguments at different positions. By C.2 of the LRM, all applicators have the same precedence. By C.7 of the LRM all applicators are left associative. This means that when parsing of an expression that applies a higher order function to many arguments, some single arguments and some multiple arguments (i.e. expression lists), the arguments are handled identically.
We have
f (a) (b,c) = f a (b,c) = (f a) (b,c)
However
f a (b,c) <> f a b c
In this case the two sides of the expression are of different types.
Another example
f g (x) y = ((f g) (x)) y = f g x y =((f g) x) y
If the intent is to apply g
to x
and pass the result to f
we require
f (g x) y = f (g (x)) (y)
Note the association and precedence does not change regardless of whether the redundant parentheses are included. In the current VDM specification the expression
f (x,y) (z) (a,b) (c) = (((f (x,y)) (z)) (a,b)) (c)
and under this proposal omitting parentheses
f (x,y) z (a,b) c = (((f (x,y)) z) (a,b)) c
The proposal is purely syntactic simplification.
Additionally, parentheses need not be required on the left hand side of a function definition. So for example, we could have
sum: nat->nat->nat
sum m n == m + n;
whereas current syntax requires
sum: nat->nat->nat
sum(m)(n) == m + n;
It is interesting that in languages such as Haskell higher order functions are the norm, whereas in VDM functions are typically defined as multiple argument functions. The benefit of higher order functions is the flexibility gained through partial application. While this is possible in VDM, the more cumbersome syntax perhaps discourages the use of higher order functions.
Relax the syntax such that function arguments (in both definition and application) need not be wrapped in parentheses where precedence rules allow.
Change definition of parameters
(LRM A.4.4) to
parameters = '(', [pattern list], ')' | pattern;
Change definition of apply
(LRM A.5.13) to
apply = expression, ('(', [expression list], ')' | expression);
None. This proposal only generalises the syntax of function definition/application. All existing specifications remain syntactically valid. No change to semantics.
The proposal relates the definition and application of functions. The same syntax is used for sequence application and map apply. Consideration can be given to whether these two forms should also support a parenthesis free form.
Will be provided if Language Board accepts change request.
Does this RM need the parser to become aware of the type of the function that it is parsing?
So if currently we say f(a,b,c)
or f(a)(b)(c)
, the parser does not need to know about the definition of f and whether it is curried or not. The type checker comes later to check that. But if this can be expressed as f a b c
with this RM for both curried and uncurried cases, then I'm not clear how the parser can decide to make this a single apply with three arguments, or a nest of three applies, each with one argument - at least not without knowledge about the curried nature of the definition of f.
I can see how "it doesn't matter" because the end result is the same (true?) but if we parse things differently after the RM (the AST is not as it would have been without the RM), then this is more than just syntactic sugar.
If we apply this RM to seq and map as well as function, I think it's just a syntax sugar. If we don't, I think we need another AST node than the "Apply" node or a "flag" in the Apply node so that the type checker can spit an error when the lefthand is not typed a function. I don't see any other ripple-effect.
With regard to "Optional component" in the RM, I propose to apply this RM to seq and map applications. It'll keep consistency with unary function application, and give clear lexical distinction from operation calls. Unary functions, maps and sequences are similar kind. Operations are different from them. I'd like to hear opinions about this.
@tomooda I agree, we should include the optional component for better consistency.
@nickbattle Your comment about f a b c
being interpreted as a single apply with three arguments. I don't understand this. f a b c
to my mind can only be curried application of a higher order function (f : @a -> @b -> @c -> @x
). The only way to have a single application with three arguments is by the syntax f(a,b,c)
where f : @a * @b * @c -> @x
.
OK Paul, I misunderstood one of the paragraphs in the proposal. So the parser can remain unaware of the types of the symbols, which is obviously desirable. I haven't looked at how tricky the parse would be, but I think it's unambiguous, so it should be relatively simple.
I'm not sure why you think we need a flag in the Apply node, Tomo? Applies are already used for map and sequence operations as well as function application. Or are you saying that we would have to do this if the new syntax was only applicable to functions?
Nick, Yes, I wrote we will need a flag only If we apply this RM to seq and map as well as function. Otherwise we don't have to distinguish if the arg is in a pair of parentheses.
Looking at the parser, there is this comment in the source (when we have read a name followed by a bracket):
// Either sequence(from, ..., to) or func(args) or map(key)
// or mk_*(), is_*(), mu(), pre_*(), post_*(),
// init_*() or inv_*()
Are we saying that mk, mu and is are unaffected, or can they use the new syntax too? We have to be careful because in some cases the arguments are not expressions (eg. is_ takes a type argument, and mu takes a "bare" maplet).
The new syntax can’t apply to mk_
(tuple constructor), mu
or is_
because they are multi-argument functions hence need the parentheses. The only place it could possibly be used is with mk_
(record constructor) where the record only has one field. If there are more fields then it is like a multi-argument function and needs the parentheses.
You could go down the path of allowing curried record construction in which case the parentheses free form is useful, but that is straying somewhat beyond the original proposal and has more wide ranging effects.
The real benefit of the proposal is in the definition and application of user defined higher order functions.
I am strongly opposed to this RM, because:
Higher order functions are powerful but should therefore also be used with extreme care. They are rarely used in specifications because they usually add complexity rather than removing it. I sincerely doubt that the perceived improvement of the construct in VDM as proposed makes VDM more user friendly.
I have to say I agree with Marcel on this one. Don't fix something that ain't broke.
I think that what one considers a readable syntax depends significantly on what computing languages one has grown up with. I agree with Marcel that for me the readability is decreased but I think that this is because I have grown up at a point of time where the brackets was commonly used (most extremely in Lisp). I would think that it would be worthwhile to test how readable the new suggested syntax looks for average newcomers. Personally I would stick to the "old" syntax but that would also be possible with this RM. As far as I understand from the comments made from Nick the parser and type checker is not more complicated with the new syntax. If you like I can try to give a selected number of syntaxes to a group of students that have never seen VDM before and then ask them what their intuition tells them that the syntax mean. If it generally decreases readability for newcomers I would agree with Marcel and being in change of rejecting this RM.
That sounds like an interesting experiment, but even if your students have not seen VDM before they would still be biased to some extend depending on the other languages they have been exposed to before. I think that for people more familiar with functional languages the notation without brackets is probably more intuitive. Question is: is that the target group of VDM?
Nico Plat mail@nicoplat.com
Op 17 dec. 2017, om 14:24 heeft Peter Gorm Larsen notifications@github.com het volgende geschreven:
I think that what one considers a readable syntax depends significantly on what computing languages one has grown up with. I agree with Marcel that for me the readability is decreased but I think that this is because I have grown up at a point of time where the brackets was commonly used (most extremely in Lisp). I would think that it would be worthwhile to test how readable the new suggested syntax looks for average newcomers. Personally I would stick to the "old" syntax but that would also be possible with this RM. As far as I understand from the comments made from Nick the parser and type checker is not more complicated with the new syntax. If you like I can try to give a selected number of syntaxes to a group of students that have never seen VDM before and then ask them what their intuition tells them that the syntax mean. If it generally decreases readability for newcomers I would agree with Marcel and being in change of rejecting this RM.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/overturetool/language/issues/43#issuecomment-352255419, or mute the thread https://github.com/notifications/unsubscribe-auth/AI-uHwdI81zTI3wiB-l0s2M63oIQJdl0ks5tBRYLgaJpZM4Onmjj.
A familiarity issue might occur when f(x)
style is exposed to a user who is familiar with f x
style and vice versa. This could be resolved by tool (or environment) that automatically formats a source specification into a preferred style. Auto-formatter (aka pretty printer) is one of common functions of modern IDEs. I think auto-formatter is fairly easy to implement on top of IDE that manages AST.
Deeper change would happen between f(x, y)
style and its curried form f x y
. This would bring more difference because they are typed differently (the former is @t1 * @t2 -> t3 while the later is @t1 -> @t2 -> @t3). This characterizes this RM as an inducer to higher order functions, which I expect more reusable utility functions (such as map, filter and reduce functions) could be defined in.
There have been a number of comments on this RM so I will respond to all in this email.
I have a background in functional languages so I do find the necessity to wrap every argument of repeated application of a higher-order function in parentheses to be syntactically clumsy and a hinderance to understanding, hence why I raised the RM. The parentheses free form is the normal approach in languages such as Haskell, ML, Miranda, Idris. F# and others. It doesn’t prevent those who prefer the parenthesised from using them, but does allow those who do not to omit the parentheses. Is this our audience? I would say our audience is anyone we can get interested in VDM regardless of their background.
I think encouraging the use of higher order functions is a desirable goal. They are one of the most fundamental tools available to the functional programmer due to the ability create concise, flexible and modular programs. It was probably best explained by John Hughes in his seminal paper “Why Functional Programming Matters”. If you were to look in the Haskell Prelude (a comprehensive standard library) you will find higher order functions are everywhere and always used in preference to multi-argument functions (strictly Haskell does have the concept of multi-argument functions, a tuple serves the same purpose). I don’t agree that higher order functions add complexity. As argued by John Hughes, they can be used to reduce complexity. Note additionally that in the Haskell Prelude function arguments are rarely wrapped in parentheses, except when necessary.
To me this RM proposes no more than a simple syntactic generalisation: allow the omission of parentheses around a function argument where precedence rules allow.
On 18 Dec 2017, at 10:16 am, Tomohiro Oda notifications@github.com wrote:
A familiarity issue might occur when f(x) style is exposed to a user who is familiar with f x style and vice versa. This could be resolved by tool (or environment) that automatically formats a source specification into a preferred style. Auto-formatter (aka pretty printer) is one of common functions of modern IDEs. I think auto-formatter is fairly easy to implement on top of IDE that manages AST. Deeper change would happen between f(x, y) style and its curried form f x y. This would bring more difference because they are typed differently (the former is @t1 https://github.com/t1 * @t2 https://github.com/t2 -> t3 while the later is @t1 https://github.com/t1 -> @t2 https://github.com/t2 -> @T3 https://github.com/t3). This characterizes this RM as an inducer to higher order functions, which I expect more reusable utility functions (such as map, filter and reduce functions) could be defined in.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/overturetool/language/issues/43#issuecomment-352293913, or mute the thread https://github.com/notifications/unsubscribe-auth/AQisRkA2IriulNtsYxOPn3f9m15mB2igks5tBaDWgaJpZM4Onmjj.
The fundamental problem I have is that additional semantics is assigned to whitespace - imho bad practice for building compilers, in particular if you want to use compiler generators (like antlr). Furthermore I should not need to rely on a tool to make the code understandable or to increase readability. The terse syntax of functional languages like haskell is to me also the main reason for their lacking acceptance as their syntax simply is too terse. Punctuation symbols exist for a reason, in particularly so in computer languages. Even making it optional is not a good idea as it breaks backward compatibility and it does not add any expressiveness. For me parenthesis are what indention is in python. Instant visual help to understand how I should read the code / specification. Leaving them out is not a good idea. This RM should really be rejected IMHO.
Marcel, we've already been relying on tools to author and comprehend VDM specs. I think that's the reason why we choose IDE-based environments that is equipped with syntax highlights, auto-indentation and as-you-type syntax/type checking.
I like the idea of the IDE auto-formatter since that would enable both sides to be satisfied, but unfortunately I believe that this first need us to resolve how we treat comments in the AST. A few years ago I had a couple of MSc thesis students looking into providing refactoring support inside Overture and in order t do this optimally one needs to support automatic layout of VDM models but then this was held back by the fact that the comments are not included in the AST at all (as far as I understand).
The readability argument is not at all tool related. Why are we even considering an RM that has no advantage from the expressiveness point of view and obviously complicates everything? I really fail to see any argument in favour for this RM.
Peter, reproducing comments is a matter of source traceability. You can find which AST node a comment is on by comparing source positions of AST nodes and the source position of the comment. (VDMPad's formatter takes another approach to reproduce comments)
All the proposal does is to generalise syntax, redundant parentheses can still be added, if desired. I don't think that the proposal decreases the readability of specifications, in most cases it does the exact opposite.
Marcel, higher order functions are very expressive and the proposed syntax reduces burden of multiple-nested parentheses. As the purity of functions is ensured in VDM10, I think it's good time to support more use of higher order functions and parametric polymorphism.
Marcel, I think that the issue here is tool related in the sense that for people with different background different syntaxes are most readable. For you and I the syntax with brackets are indeed most readable, but for "younger" people who for example have been brought up with other computer languages a syntax without the brackets may be more readable. Thus, if it is possible to support both by intelligent tool support automatically transforming the syntax of a VDM model into the syntax that is most readable to you (as a setting where you would choose to see things with or without the brackets).
I've read the entire thread once more and I remain strongly against this proposal. The proposers have only given subjective arguments why the change to the syntax is a good thing. So far, all my (imho objective) arguments against have basically been ignored or waived away without actual argumentation: decreasing readability, increasing parsing complexity, no added value for expressiveness. As Nico puts it rightly: don't fix something that isn't broken!
Thank you all, for your input on this RM.
Based on the community discussion, the Language Board (LB) has decided to reject this RM.
At today's LB net meeting, 5 members voted against this RM, 1 member voted in favor of the RM, and 1 member abstained from voting.
Identification of the Originator
Paul Chisholm
Target of the request, defining the affected components of the language definition
The syntax of function application.
Motivation for the request
VDM requires that every argument to a function, when applied, is wrapped in parentheses. While this is essential for functions of multiple arguments, for functions of a single argument we generally must include parentheses that contribute nothing of real value, only serving to complicate the syntax. Modern functional languages have addressed this situation by not requiring that function arguments be enclosed in parentheses (subject to the rules of syntax precedence).
For example, a function of three arguments would be invoked with
f(a,b,c)
as a multiple argument function,f(a)(b)(c)
as a higher order function, andf a b c
as a higher order function if arguments need not be enclosed in parentheses.Additionally, parentheses need not be required on the left hand side of a function definition. So for example, we would have
whereas current syntax requires
It is interesting that in languages such as Haskell higher order functions are the norm, whereas in VDM functions are typically defined as multiple argument functions. The benefit of higher order functions is the flexibility gained through partial application. While this is possible in VDM, the more cumbersome syntax perhaps discourages the use of higher order functions.
Description of the request
Relax the syntax such that function arguments (in both definition and application) need not be wrapped in parentheses where precedence rules allow.
description of the modification
Change definition of
parameters
(LRM A.4.4) toparameters = '(', [pattern list], ')' | pattern;
Change definition of
apply
(LRM A.5.13) toapply = expression, ('(', [expression list], ')' | expression);
benefits from the modification
possible side effects
None. This proposal only generalises the syntax of function definition/application. All existing specifications remain syntactically valid. No change to semantics.
A test suite for validation of the modification in the executable models (if appropriate).
Will be provided if Language Board accepts change request.