overturetool / language

Overture Language Board issue tracking
2 stars 0 forks source link

Append object pattern #25

Closed joey-coleman closed 10 years ago

joey-coleman commented 10 years ago

The following bug was originally reported on Sourceforge by shinsahara, 2013-12-05 04:01:32.138000:

1 Identification of the Originator Kei1.Satou@scsk.jp translated by ss@shinsahara.jp

2 Target of the request: defining the affected components of the language definition. Append object pattern to Patterns.

3 Motivation for the request Using the object pattern match, we can describe conciser specification.

4 Description of the request, including: (a) description of the modification;

    Syntax: object pattern = 'obj_', name, '(', [ field pattern list ], ')';
        field pattern list = field pattern, { ',', field pattern };
        field pattern = name, '|->', pattern;

     note:
          If there is no prefix keyword, it is very difficult to analyze syntax.
(b) benefits from the modification; 
      Now, we have to write following:
    cases true:
     (isofclass(P, p) and p.name = "tom" and isofclass(C, p.fld4) and p.fld4.fld2 = "abc")
         -> p.age,
      others -> 0
    end
    Using the object pattern,  we can write as following conciser specification:
    cases p:
      obj_P(name |-> "tom", age |-> a, fld4 |-> obj_C(fld2 |-> "abc")) -> a,
      others -> 0
    end
(c) possible side effects. 
        nil

5 Source code and technical documentation where applicable

     Syntax: ObjectPattern :: cls : Name
           fields : seq of FieldPattern
           CNTXT
           ;
     FieldPattern :: nm : Name
           pat : Pattern
           CNTXT
           ;
(a)source
MatchObjectPattern : STKM`ObjectPattern * SEM`VAL ==> set of SEM`BlkEnv
MatchObjectPattern ( mk_STKM`ObjectPattern(cls, fields_lp,-), val_v) ==
    if is_SEM`OBJ_Ref(val_v) and STATE`IsAClass(cls) then
     let objnm = val_v.tp in
       if (objnm = cls) or STATE`IsSubClass(objnm, cls) then
           (dcl pat_lp : seq of STKM`Pattern := [],
          val_lv : seq of SEM`VAL := [];
          for mk_STKM`FieldPattern(field, pat, -) in fields_lp do
             (pat_lp := pat_lp ^ [pat];
             val_lv := val_lv ^ [EXPR`EvalFieldOBJRefSelect(val_v, field)]);
          return MatchLists(pat_lp, val_lv))
       else return {}
     else return {};

6 A test suite for validation of the modification in the executable models. The result of the test case written after “--“ are followings:

d new A().Test()
--[ 15, 15 ]

cr p := new P(“Peter", 20, <M>, nil)
print let obj_P(name |-> n) = p in n
--"Peter"

class A
operations
public
Test: () ==> seq of nat
Test() ==
def p = new P("tom", 15, <M>, new C(3, "abc")) in
  return [
    cases p:
      obj_P(name |-> "tom", age |-> a, fld4 |-> obj_C(fld2 |-> "abc")) -> a,
      others  -> 0
    end,
    cases true:
      (isofclass(P, p) and p.name = "tom" and isofclass(C, p.fld4) and p.fld4.fld2 = "abc")
              -> p.age,
       others -> 0
    end
  ];
end A

class P
instance variables
public name : seq of char := "";
public age : nat := 0;
public sex : [<M> | <F>] := nil;
public fld4 : [C|nat] := nil;
operations
public
P: seq of char * nat * [<M> | <F>] * [C] ==> P
P(n, a, s, c) ==
 (name := n;
  age := a;
  sex := s;
  fld4 := c;);
end P

class C
instance variables
public fld1 : nat := 0;
public fld2 : seq of char := "";
operations
public
C: nat * seq of char ==> C
C(f1, f2) ==
 (fld1 := f1;
  fld2 := f2;);
end C
joey-coleman commented 10 years ago

Comment by shinsahara, 2013-12-05 04:50:25.207000:

  • Description has changed:

Diff:


--- old
+++ new
@<span></span>@ -1,30 +1,29 @<span></span>@
-1. **Identification of the Originator**
+1 **Identification of the Originator**
     Kei1.Satou@<span></span>scsk.jp
     translated by ss@<span></span>shinsahara.jp
-+  **Target of the request:** defining the affected components of the language definition. 
+2 **Target of the request:** defining the affected components of the language definition. 
     Append object pattern to Patterns.
-+  **Motivation for the request**
-    Using the object pattern match, we can describe conciser specification.  
+3 **Motivation for the request**
+        Using the object pattern match, we can describe conciser specification.  
-+  **Description of the request, including:**
+4 **Description of the request**, including:
     (a) description of the modification; 
-~~~~~~
-          Syntax: object pattern = 'obj_', name, '(', [ field pattern list ], ')';
-        field pattern list = field pattern, { ',', field pattern };
-        field pattern = name, '|->', pattern;

@@ -127,7 +130,6 @@ P(n, a, s, c) == (name := n; age := a; -~~ sex := s; fld4 := c;); end P @@ -144,4 +146,4 @@ (fld1 := f1; fld2 := f2;); end C

+~~

joey-coleman commented 10 years ago

Comment by shinsahara, 2013-12-05 04:51:04.510000:

  • assigned_to: Shin Sahara
  • private: Yes --> No
joey-coleman commented 10 years ago

Comment by nick_battle, 2013-12-05 09:07:25.288000:

Shouldn't the syntax be "'obj_', identifier" rather than name?

I see that the field patterns map names to patterns, so I could say "obj_C(A`x |-> 1)", if A was a baseclass of C? Is that as intended, or should these names also be identifiers?

There is no comment about private or protected field access. Is the intention to ignore field access permissions, or is it illegal to use obj_C with anything other than a public field?

I assume there is no restriction on whether or not a field is static?

It would be an error to use the same field name in two or more field patterns?

I had a slightly uneasy feeling when I first read this because I think of pattern matching as something that happens with pure values rather than state (though those pure values may have come from state variables). But an object is not a pure value - it is always state. For example, if I say "mk_R(x) = r", where r is a record instance variable, then the semantics takes a snapshot of the r value and uses it to match; it does not matter if r changes during this process, because a pure value snapshot has been taken at a point in time. But if I say "obj_C(x) = c", does that mean that there is a snapshot of c taken, so that the match is unaware of state changes that may occur within the object (on another thread)? Put another way, can this ever be false:

let mk_R(obj_C(x), obj_C(y)) = mk_R(a, a) in x = y;

If we write the same expression with some other pattern in place of the obj_Cs, then I think this can never be false because the value of "a" is read and isolated before the pattern match is performed. But you can't do this with an object.

joey-coleman commented 10 years ago

Comment by nick_battle, 2013-12-05 09:49:22.734000:

A better example might be this:

let v = objInstVar in
    let mk_(obj_C(f |-> x), obj_C(f |-> y)) = mk_(v, v) in x = y

Here we only have one value, v, so it cannot change in the creation of mk_(v, v). So patterns which match the same part(s) of v are normally guaranteed to bind the same values. But if v is an object reference, this is not guaranteed unless we define something about snapshots for object values.

joey-coleman commented 10 years ago

Comment by nick_battle, 2013-12-15 10:44:20.498000:

Regarding the access of fields and whether object patterns are limited to public fields, I've attached a short spec to remind us of the access rules in VDM++ - both VDMJ and VDMTools honour these rules.

I think this means that the object pattern could simply be subject to the same rules - eg. it is possible to match private field values of an object, but only from within methods of the class of that object.

Files were attached:

joey-coleman commented 10 years ago

Comment by shinsahara, 2014-03-30 04:11:23.204000:

Abstract of dr.’s answer are followings:

(1) objname is bad. ‘obj’, identifier is: Syntax: object pattern = 'obj_', identifier, '(', [ field pattern list ], ')'; field pattern list = field pattern, { ',', field pattern }; field pattern = identifier, ‘|->', pattern;

(2) Can access public identifier in Object. (dr.k forgot write description)

(3) Restrict object reference in multi thread environment.

joey-coleman commented 10 years ago

Comment by nick_battle, 2014-03-30 07:56:49.855000:

Re:(2), do you mean access is limited to public fields always, or can you match on private fields if you are in an operation defined within the class that you are matching on - ie. can an operation in class C match on the protected/private fields of C, or only its publics? (if not, why not?)

Re:(3), do you mean it would be a runtime error to use an object pattern in a multithreaded environment with more than one thread or a compile time error to use with a spec that contains thread definitions or perhaps are object matches implicitly mutually exclusive? What does this mean in an RT environment (some threads are implicit in RT, as when making async calls).

joey-coleman commented 10 years ago

Comment by kgpierce, 2014-03-30 10:01:47.805000:

Nick needs information from VDMTools on this. The RM is on hold until the LB discusses how to have better dialogue between the two tools developers.

joey-coleman commented 10 years ago

Comment by nick_battle, 2014-04-02 13:14:16.783000:

Looking at the latest 9.0.4 build (28th Feb), I see the following:

  1. obj_C is indeed parsed using an "identifier" rather than a "name".
  2. Similarly, the name of the field pattern is an identifier, not a name.
  3. Access from a separate test class only works with public fields. Private and protected fields give runtime errors (Instance variable is not in scope), rather than type checking errors.
  4. Access from a test subclass works for public and protected fields. Private fields give runtime errors.
  5. Access from the same class works for public, protected and private fields.
  6. Binding two different fields to the same receiving pattern does not match when the field values are the same. For example, "obj_A(pub |-> a, priv |-> a) = new A()" does not match, even if the pub and priv fields have the same value. Perhaps this is a bug?
  7. Non-existent fields are picked up by the type checker.
  8. Fields may refer to inherited names from a superclass (but only using an identifier, not a qualified name (ie. no hidden fields)).
  9. If you match a superclass value against a subclass pattern, you get runtime errors for any extra field patterns (not type checker errors)
  10. If you match a subclass value against a superclass pattern, you get type checker errors for missing fields
  11. Matching works with static instance variables, though there appears to be no access limits on statics (public, private and protected are all visible and work at runtime from any class). Perhaps this is a bug?
  12. It is not an error to match the same field to more than one pattern. So "obj_A(pub |-> a, pub |-> b) = new A()" creates "a" and "b" with the same value
  13. I've tried running a concurrent object update with two object matches, but there is never a different pattern match - at least not while allowing the loop to run for about a million attempts. Presumably this is a coincidence of scheduling rather than some sort of deliberate defence in the language?
joey-coleman commented 10 years ago

Comment by nick_battle, 2014-06-29 17:28:47.996000:

(Email from Shin)

Hi Nick,

I got mail from dr.k.

2014/06/24 22:37、Nick Battle nick.battle@<spangmail.com> のメール:

Hi Shin,

​We have another meeting on Sunday. Is it possible to get any more information from Dr.K before then?​

2014/05/06 15:55、Nick Battle <nick.battle@<span></span>gmail.com> のメール:
​Regarding this, perhaps there is time to get feedback (via Shin) from Dr. K regarding the earlier email I sent about this. I've reproduced the points here for convenience:​

​Looking at the latest 9.0.4 build (28th Feb), I see the following:

1. obj_C is indeed parsed using an “identifier" rather than a "name".

Yes, obj_C is identifier.

Syntax: object pattern = 'obj_', identifier, '(', [ field pattern list ], ')'; field pattern list = field pattern, { ',', field pattern }; field pattern = identifier, ‘|->', pattern;

2. Similarly, the name of the field pattern is an identifier, not a name.

Yes.

3. Access from a separate test class only works with public fields. Private and protected fields give runtime errors (Instance variable is not in scope), rather than type checking errors.

4. Access from a test subclass works for public and protected fields. Private fields give runtime errors.

5. Access from the same class works for public, protected and private fields.

6. Binding two different fields to the same receiving pattern does not match when the field values are the same. For example, "obj_A(pub |-> a, priv |-> a) = new A()" does not match, even if the pub and priv fields have the same value. Perhaps this is a bug?

7. Non-existent fields are picked up by the type checker.

8. Fields may refer to inherited names from a superclass (but only using an identifier, not a qualified name (ie. no hidden fields)).

9. If you match a superclass value against a subclass pattern, you get runtime errors for any extra field patterns (not type checker errors)

10. If you match a subclass value against a superclass pattern, you get type checker errors for missing fields

11. Matching works with static instance variables, though there appears to be no access limits on statics (public, private and protected are all visible and work at runtime from any class). Perhaps this is a bug?

12. It is not an error to match the same field to more than one pattern. So "obj_A(pub |-> a, pub |-> b) = new A()" creates "a" and "b" with the same value

Yes, he forgot access permission. He’ll obey your opinion.

13. I've tried running a concurrent object update with two object matches, but there is never a different pattern match - at least not while allowing the loop to run for about a million attempts. Presumably this is a coincidence of scheduling rather than some sort of deliberate defense in the language?​

dr.k thinks that we can’t use object pattern in multi thread environment.

Cheers, Shin

joey-coleman commented 10 years ago

Comment by nick_battle, 2014-06-29 17:29:38.415000:

(Email from NB)

Hi Shin,

​Thanks for the translations!​

On 25 June 2014 07:15, SaharaShin shinsahara@<spanmac.com> wrote:

Yes, obj_C is identifier.

​OK, that's fine. Similarly with the field name.​

Yes, he forgot access permission.
​ ​
He’ll obey your opinion.

​OK, I assume that means the permissions will be the same as regular field access rights, so if you could access "obj.f" in some situation, then you could also successfully use "f" in an object pattern in the same location.​ I think we need to have static type checking errors rather than runtime errors, of course.

dr.k thinks that we can’t use object pattern in multi thread environment.

​This seems strange to me, though clearly it avoids the problem I was worried about. Does he mean it would be a runtime error if there was more than one thread executing (rather than blocked), or a static error if the specification has thread definitions?

​It would be possible to do an "atomic" ​match of object values to a pattern, such that we can guarantee that there are no thread swaps during the evaluation of a pattern match. We do this currently during pre/postcondition/invariant evaluations, which take place in zero time (for RT) and without thread swaps.

It still worries me that we don't have any solid understanding of what happens in VDM++ when you pass an object reference to a piece of specification that deals with immutable values, like a function parameter. For example:

class A
functions
    public first: seq of S -> S
    first(s) == hd s;

end A

​If​ S is a record type, this is a pure function and there is no issue: the return value is always the same as the value of the head of the sequence passed in. But if S is a class and we have multiple threads, then the "value" of the object passed back from the function may be a different value (nin terms of its state content) to when it was passed in - which seems to violate functional integrity (they must always return the same result for the same arguments).

​But we don't say that you can't call functions in a multi-threaded environment.​ Nor do we say that function application is atomic, usually, though there is the strange case of post_op functions that have to reason about what you might call "self" and "self~". The signature of a post_op function is not defined for VDM++, but it (presumably) involves passing object references. The VDM-SL post_op functions pass the immutable record value of the Sigma before/after states (that may even be why the Sigma record type was required!)

I know all this is difficult to solve for, but if we blunder into an inappropriate solution for object patterns, we might be making the whole situation worse for VDM++ generally.

(Copied PGL as we were discussing something similar, recently).

-- Cheers, -nick

joey-coleman commented 10 years ago

Comment by nick_battle, 2014-06-29 17:31:25.922000:

(Email from PGL)

Dear all, I would like to suggest that we aim for making the interpretation of functions atomic. If this is done we can have a clean access to objects. This could apply both to the suggested object pattern as well as enabling one to lookup a public instance variable from an object passed as a parameter for a function. As far as I can see the only potentially negative consequence of this would be that the time granularity in VDMRT would be slightly less fine grained but I don’t see a problem with this.

Cheers, Peter

joey-coleman commented 10 years ago

Comment by nick_battle, 2014-06-29 17:33:24.616000:

Note that in VDMTools, the following function can access the state of the object reference passed in, via the object pattern:

class A
instance variables
  public iv:nat := 123;
end A

class B
functions
  public test: A -> nat
  test(a) ==
    let obj_A(iv |-> x) = a in
  x;
end B

Initializing specification ... done
>> p new B().test(new A())
123
joey-coleman commented 10 years ago

Comment by nick_battle, 2014-07-18 07:54:02.766000:

(From an email discussion between Nick and Tomohiro)

If the language specification prohibits accesses to fields of object references in a functional context, object patterns shouldn’t appear there. That’s a consequence of the new RM. ​ ​ If the language specification does not make it clear, I think this is a good opportunity to think about it. (maybe another RM?)

​Yes, that may be the simplest solution here - I would say the RM just needs clarification, rather than a new RM.​

With regard to the atomicity of the matching, I agree with you. If an object pattern contains an operation call, the atomicity becomes messy. I don’t like it. I think the order of evaluation should be explicitly explained instead.

​That would be nice, but we don't have a deterministic definition of the thread scheduler. The only option we have (for certainty) is to state that patterns are applied atomically, but that is ugly too, as discussed.​

The graphic image of a pattern gives mental impression of “snapshot” of a state. I think we need to tell the readers of the language manual that matching process is sequential and therefore there could be concurrent interference.

​Yes, I think that's where we end up.​

​So we're saying "no" to object patterns in functions (or function contexts, like pre/postconditions etc), and "yes" to them in operations, but with a caveat that the fields being matched can change, and that (what?) a compound pattern application may find that the value changes from match to match even within one expression.​

joey-coleman commented 10 years ago

Comment by tomooda, 2014-07-18 09:22:19.618000:

So we're saying "no" to object patterns in functions (or function contexts, like pre/postconditions etc), and "yes" to them in operations, but with a caveat that the fields being matched can change, and that (what?) a compound pattern application may find that the value changes from match to match even within one expression.​

Sounds reasonable. A language spec doesn't guarantee atomicity with respect to concurrency, and a tool MAY perform an object pattern matching without context switching.

joey-coleman commented 10 years ago

Comment by tomooda, 2014-07-27 23:10:57.365000:

Sahara-san will post a new RM. The change will be the following. (the third bullet in the 4.a's note)

kgpierce commented 10 years ago

The LB voted to move this to the community discussion phase. Peter J will send an email to core soon.

pglvdm commented 10 years ago

I fully support this RM.

nickbattle commented 10 years ago

While implementing this, I've come across a problem with proof obligation generation. In some circumstances, the POG needs to generate an expression that would match a pattern, usually parametrized by the variables in the pattern. But we cannot easily do this with object patterns. For example, the following spec:

class A
instance variables
    iv : seq of nat := [123];

operations
    public op: A ==> nat
    op(obj_A(iv |-> [a])) ==
        return a;

end A

Produces the following PO in VDMJ. For now, I'm inserting a "new" expression to generate a value which might be able to match the pattern. But this assumes that there is a constructor with parameters that match the obj_ pattern's fields, which is not necessarily the case. The equivalent for a mk_R expression is much simpler, by comparison, since we can generate a mk_R expression and set the field values using the patterns directly.

Proof Obligation 1: (Unproved)
op(A): operation parameter patterns obligation in 'A' (test.vpp) at line 6:12
forall arg1:A &
  (exists bind1:A, a:nat & (arg1 = bind1) and (new A([a]) = bind1))

Unfortunately, VDMTools does not produce any proof obligations in this case. But presumably it would suffer from the same problem if it did.

Any thoughts on how to solve this one?

peterwvj commented 10 years ago

Yes that's a problem. Then we would need something like object initialisers as seen in C# http://msdn.microsoft.com/en-us/library/bb384062.aspx

pglvdm commented 10 years ago

Well as I see it the alternative would be to write:

Proof Obligation 1: (Unproved)
op(A): operation parameter patterns obligation in 'A' (test.vpp) at line 6:12
forall arg1:A &
  (exists bind1:A, a:nat & (arg1 = bind1) and ([a] = bind1.iv))

and then just explain that the assumption is that visibility of the instance variable is overrules in such a PO. Otherwise the way you have it now should also work.

nickbattle commented 10 years ago

At least the way I have it now would work if you had an appropriate class constructor, though it would probably be too much to insist that use of obj_A(f1 ..., f2..., f3...) requires there to be a constructor A(T1, T2, T3), just so that POs can be created.

I'm reluctant to invoke magical access "overruling" in POs. Though we could say that the use of a field in an obj_ pattern requires that field to be public, so that access is allowed.

tomooda commented 10 years ago

I don't think a simple new A(...) would work even if there's such a constructor because we'll need to cover all subclasses of A as well as A itself.

nlmave commented 10 years ago

Closed by LB, see http://wiki.overturetool.org/index.php/Minutes_of_the_LB_NM%2C_26th_October_2014