Closed paulch42 closed 7 years ago
Some initial thoughts:
I think you can achieve what you want using the elems
operator, which converts a sequence into a set. Did you consider that? For example:
[x | x in set elems [1,2,3]] = [1,2,3]
Also, what would the iteration order be for an expression such as e in seq ms
?
Interesting idea. I agree that such map and filter to a sequence is a little tedious in VDM. Paul, have you considered introducing "map" and "filter" operators?
The LB has decided to advance the RM. Luis Diogo Couto has been appointed the owner.
A couple of initial thoughts.
As PJ says, we do have to think about the definition of the ordering if "in seq" is used in a list comprehension. [x | x in set S] gives a sorted sequence on the elements of S, so does [x | x in seq S] do the same, or does it preserve the order in S, or something else?
Presumably "in seq S" is loose, in the sense that "let e in seq S be st p(e)" does not define the element chosen? That implies that we should also have an "in seq" traces expression - most loose expressions have traces equivalents that expand the looseness to all the possibilities.
The list already appears in a specific order (possibly with duplicates). I think comprehension should respect the order and multiplicity. That is, I believe the semantics of
[ f(x) | x in seq s & P(x) ]
should be the same as
[ f(s(i)) | i in set inds s & P(s(i)) ]
@peterwvj
@nickbattle
Preserving the order of the source sequence allows the feature to be used more widely. If we are constrained to sorting the sequence then it has to be a numeric sequence (as is the case when sequence comprehension is used with a set as the source). Consider the example from the Modelling Systems book referenced in the proposal. The 'in seq' version of that is not valid unless we allow comprehension over sequences of any type.
@tomooda
I did consider 'map' and 'filter'. I think your point is that sequence comprehension with 'in seq' as a source is a combination of 'map' and 'filter'. I find the comprehension version is clearer and more immediate than the functional version, so is preferable from the perspective of communicating a specification.
Paul, OK, I got it. I have another couple of questions.
sequence for loop = ‘for’, pattern bind, ‘in’, expression, ‘do’, statement ;
I just wonder which looks better, pattern 'in seq' expression
or pattern bind 'in' expression
.
The former takes the form in your proposal, and the later would look like
[ f(x) | x : t in xs & p(x) ]
or simply [ f(x) | x in xs & p(x) ]
.
Any thoughts?Tomohiro,
Using the ordering of the "in seq" sequence does make sense, and frees us from the need to bind to a single numeric value, though I think we are still limited to binding a single value (or a pattern that matches a single sequence value). For example, we cannot have "a,b,c in seq S" because if all the variables bound to all possible sequence value combinations (as they would with sets) then we lose the definition of the ordering. But we could have "mk_(a,b) in seq S", assuming the elements were pairs. That just affects the syntax/parser (pattern bind rather than multi pattern bind, or something like that).
I assume we would also say, as with sets, that bindings which don't match sequence values (eg. wrong type) are ignored (though currently, if you don't match a numeric value it is an error, so that can be relaxed as we no longer need numerics).
I agree that non-matching values should be "skipped".
In general, I think sequence comprehensions would be a nice addition to the language. I have struggled with precisely these issues so often that I wrote generic map and filter functions. But they look really ugly...
I'm less certain on the need of in seq
and not in seq
operators. You can get the same behaviour with in set elems
.
Regarding the for loop statement, I don't think it should be changed as part of this RM. It's a completely separate thing.
Thinking about it, the restriction on a single pattern bind with "in seq" would only apply if it was used in a sequence comprehension (because of ordering). In a set comprehension, it would make sense to allow multiple binds with "in seq" just like "in set". Perhaps that's obvious, but when I made that earlier comment I was thinking it was because of the way "in seq" works, but it's really about the sequence comprehension's need for a well defined order.
Perhaps someone could sketch the revised grammar for Chapter 8 of the LRM (Bindings)?
bind = seq set bind | type bind ;
seq set bind = seq bind | set bind ;
seq bind = pattern 'in seq' expression ;
multiple bind = multiple seq bind | multiple seq bind | multiple type bind ;
multiple seq bind = pattern list 'in seq' expression ;
sequence comprehension = ‘[’, expression, ‘|’, seq set bind, [ ‘&’, expression ], ‘]’
This will affect the following constructs.
Luis, as owner of this, would you like to summarise what has been discussed so that we can quickly decide what to do at the meeting on Sunday?
Summarisation!
elems
and inds
operators (this is the do-nothing approach)in seq
and not in seq
really needed as operators?for
loop statements look with the new binding?bind = seq set bind | type bind ;
seq set bind = seq bind | set bind ;
seq bind = pattern 'in seq' expression ;
multiple bind = multiple seq bind | multiple seq bind | multiple type bind ;
multiple seq bind = pattern list 'in seq' expression ;
sequence comprehension = ‘[’, expression, ‘|’, seq set bind, [ ‘&’, expression ], ‘]’
Thanks Luis. And I think there is also the point about the ordering, when used in a sequence comprehension, following the order of the original sequence, and that being restricted to a single binding (not a multi-bind) but not necessarily of a single numeric value as it is currently.
I thought that had been decided already. But I will add it.
The LB has decided to proceed with the RM, which enters the discussion phase.
Based on today's meeting people generally seem to be supportive of the in seq
in sequence comprehension but less in favor of the in seq
operator.
I agree that using "in seq" inside a sequence comprehension would make it clearer so I am in favour of that proposal. However, I am against using this generally and I certainly don't see a pressing need for having it (and its negation) as binary operators.
The LB has decided to move this RM to 'Execution'.
In preparation for execution, the status of open questions is:
Are in seq and not in seq really needed as operators?
No. These will not be implemented.
How should for loop statements look with the new binding?
Loop statements will not be affected by this RM. They remain as before.
Any relevant POs or TCs?
Nick will investigate this through the VDMJ implementation and consult with the LB as things come up
There is a very early draft of this available for VDMJ. The syntax and type checking is reasonable. Simple sequence comprehensions look OK, but I haven't tried anything further as yet.
https://github.com/nickbattle/vdmj/releases/download/3.1.1-1/vdmj-3.1.1-RM36.jar
I think the semantics of "in seq" bindings are clear in all cases except the "awkward squad" - the last four cases listed above in Luis' summary:
First def statements. These can include set binds, but you end up with "def e in set S = n in...". This is hard enough to parse, let alone understand. It may just be that the grammar uses a patternBind because it makes sense for patterns and or type binds. But set binds (and now, sequence binds) don't seem to make sense. Not sure what to do here.
You get similar issues with the sequence for loop, and trap and tixe statements. They are defined with patternBinds and make sense with patterns and type binds, but set binds (and now seq binds) make less sense.
Does anyone have a view? We can perhaps discuss further on Sunday.
As far as I understood, sequence for loops are to remain unchanged.
As for the others, I'm not sure. If we're pragmatic, seq binds really make a difference for enumerations. But I would like for the language to be consistent. Is there anywhere that set binds are not allowed but other binds are?
value definitions optionally allow patterns to be typed, but their syntax is defined as pattern, [ ':', type ]
instead of pattern | type bind
. I rather wonder why it's not pattern bind
.
To me, it looks natural to allow set bind in def statements and any other definition-related constructs because sets are quite close concept with types.
Def statements with set binds always look very strange to me, like "def e in set {1,2,3} = 2 in ...". Internally, VDMJ uses this to decide that e is of type nat1, and when the definition is executed, it is bound to the value 2, after a check that 2 is a member of the set (which can be an expression, so it may change at runtime). The presence of the set bind dynamically limits the assignable value, so the definition "e in set S = v" is saying that v is guaranteed to be in S, as well as binding it to e.
Perhaps that has some specification value, but "def e:T = v in..." makes more sense to me, as does "def e = v in..." (type binds and patterns, respectively). I'm not sure whether the choice of "pattern bind" in the grammar is deliberate or not.
So I could extend this, and allow "def e in seq S = v in...", which would require v to be in S. Just checking back, because this is a awkward case internally as well :)
Sequence for loops were mentioned in your list, Luis. You can write "for e in set S in sequence do...". As above, what this does is limit the values from the sequence to be from the set - with a runtime error if they are not there.
Same story with trap statements: "trap s in set S with stmt in body" will cause a runtime error if an exception matching s does not exist in S. It is similar for tixe statements, except if the match to the set fails, other tixe alternatives are attempted (so no runtime error).
So should all of these get "sequence binding" extensions for RM36, or are they set-binds only?
I've drafted changes for these four last cases (defs, loops, traps and tixes). It wasn't too awkward, and the change has roughly the same semantics as "in set" - ie. if the value being matched to the pattern is not also in the sequence, we throw an error. I'll commit it later, if there are no objections.
I noticed one curious thing with tixes and "in set" binds, and what I said before isn't quite right. The tixe cases are tried in order as long as the exit code does not match the type of the pattern or match the pattern itself. But if there is a set binding, and if the value matches the pattern but is not in the set, this is a runtime error. I assumed it would try the other tixes (so you could divide up your processing by using the same pattern with different sets). But both VDMJ and VDMTools fail with a runtime error in this case.
We should presumably add POs for the case of failure to find the value in a set or sequence (in the cases that use pattern-bind, like "def s in S = v in..."). I think it's just "v in set S" or "v in set elems S", with the usual context around it.
I've created POs for the cases where there is a clear value in scope that must match the set/sequence in a bind. In the case of traps and tixes that's harder (the value is an exit code), but this isn't the only problem with statement POs! So the spec below now produces the POs shown:
functions
f: nat -> nat
f(n) ==
def s in seq [1,2,3] = n in
def x in set {4,5,6} = n in s+x
pre n > 0 and n < 4;
operations
op: nat ==> ()
op(n) ==
for s in set {4,5,6} in [7,8,9] do
for s in seq [10,11,12] in [13,14,15] do skip;
Proof Obligation 1: (Unproved)
f, s: sequence membership obligation in 'DEFAULT' (test.vdm) at line 4:32
(forall n:nat & ((n > 0) and (n < 4)) =>
n in set elems [1, 2, 3])
Proof Obligation 2: (Unproved)
f, x: set membership obligation in 'DEFAULT' (test.vdm) at line 5:36
(forall n:nat & ((n > 0) and (n < 4)) =>
(def s in seq [1, 2, 3] = n in
n in set {4, 5, 6}))
Proof Obligation 3: (Unproved)
op: set membership obligation in 'DEFAULT' (test.vdm) at line 11:13
(forall s in set elems [7, 8, 9] &
s in set {4, 5, 6})
Proof Obligation 4: (Unproved)
op: sequence membership obligation in 'DEFAULT' (test.vdm) at line 12:17
(forall s in set elems [13, 14, 15] &
s in set elems [10, 11, 12])
This has now largely been ported over to Overture 2.3.9-SNAPSHOT. I've been testing some Jenkins builds, with the help of PJ, Paul, Luis and Tomo, but feel free to try yourselves.
See http://overture.au.dk/overture/test/Build-62_2016-07-20_14-45/
Message changes for the LRM are now pushed in the documentation 'editing' branch.
The feature proposed in this RM is available in Overture version 2.4.0 onwards, and the LRM has been updated accordingly.
Identification of the Originator
Paul Chisholm
Target of the request, defining the affected components of the language definition
Affects type 'seq’ and its primitive functions.
Motivation for the request
Comprehensions allow bindings over sets or types. It is a common situation to want to create a comprehension that ranges over a sequence. While one can define functions in VDM-SL to achieve the same purpose, being able to specify comprehensions that directly range over sequences would result in specifications that are more immediate, clearer and concise.
Description of the request
Add primitive sequence functions and allow their use in bindings such that comprehensions can range over sequences directly.
description of the modification
Add the primitive functions ‘in seq’ and ’not in seq’ to type ‘seq’. Extend the definition of binding to include ‘in seq’.
benefits from the modification
Allows comprehensions to range over sequences directly which allows more immediate specification of functions over sequences. ‘in seq’ and ‘not in seq’ are available as primitive functions; no need for individual specifications to define them. At present one can use a combination of sequence indices and set binding. For example, the following example for the Fitzgerald/Larsen “Modelling Systems” book (section 7.4)
[ ms(i) | i in set inds ms & Classify(ms(i),cat) = <HI> ]
becomes the clearer and more immediate
[ e | e in seq ms & Classify(e,cat) = <HI> ]
possible side effects
None identified.
A test suite for validation of the modification in the executable models (if appropriate).
None