overturetool / language

Overture Language Board issue tracking
2 stars 0 forks source link

"Pure" operations called in functions #27

Closed joey-coleman closed 9 years ago

joey-coleman commented 10 years ago

The following bug was originally reported on Sourceforge by ldcouto, 2014-07-28 06:14:14.783000:

1 Identification of the Originator Luis Diogo Couto

2 Target of the request: defining the affected components of the language Introduce "pure" operations and allow them to be called inside functions. Pure operations do not affect the state of any object.

3 Motivation for the request When using an object as a function parameter, it is often necessary to query the internal state of the object. A typical example is when the object is used as an operation parameter and its internal state needs to be inspected in the pre condition.

The only way to do that at the moment is to make the relevant instance variables public, which breaks encapsulation.

4 Description of the request, including: (a) description of the modification: Allow pure operations to be called inside functions. A pure operation does not affect the state of its object nor of any other. It is referentially transparent.

Pure operations can only read instance variables and call functions and other pure operations.

A new keyword "pure" can be introduced to identify these operations.

Another option is to use the externals clause. An operation that contains ext rd x is pure. Currently such an operation can call any operations on x so the semantics of ext rd would be altered.

(b) benefits from the modification: When using an object as a function parameter it would be possible to check (or get) data that depends on the internal state of the object. This could be done without exposing the internal state of the object, which is an important principle of OO design.

This would be particularly helpful if we want to use objects in pre/post conditions and invariants, which are signature features of VDM. Pure operations would be particularly suited to implement the checks that these conditions rely on.

(c) possible side effects: If the new keyword is introduced, current models that use it as a name will no longer be correct. For what it's worth, none of the overture examples use the word pure as a name.

If the semantics of ext rd are altered, models that use it may no longer be correct. 11 examples use ext rd, though I have not checked the bodies of those operations.

5 Source code and technical documentation where applicable Documentation: new keyword or updated semantics of ext rd need to explained in the language manual.

Source code: the new keyword would require changes to the parser. Both versions would require changes to the type checker.

6 A test suite for validation of the modification in the executable models. N/A

nickbattle commented 9 years ago

Well (if I understand you), it might be acceptable for us to limit folk to pre/postcondition quotation rather than other operations. But we would need to be sure that it ought always to be possible to fix your class to get the result you wanted. That might be true, but I'd have to think about it :)

The pre/post functions include the "normal" parameters of the operation they constrain as well as their own "state" parameters. But the pre/post_op function body cannot access operations or state variables of the objref passed in, because that's "state". The general "fudge" to convert the state parameters to local variables does not (currently) extend to objref parameters, since they are not "local state" but "passed state".

It's just a mess isn't it? :-(

ldcouto commented 9 years ago

It is less than ideal, yes.

I'm conflicted about this. On the one hand, we could focus on defining the user-facing stuff (intended behavior, syntax, etc.). On the other, we really need a solution that is reasonably implementable so maybe this should be somewhat driven by tool constraints.

kgpierce commented 9 years ago

Yes, we're in a bit of a pickle, but this discussion is certainly helping me understand. I think it's important that both the solution makes sense and is implementable, and I think we're going to have to keep both in mind as we go.

So here's a simple idea to discuss Luis' "objects as parameters" point:

class Duration

instance variables
private time: nat;

operations
GetHours: () ==> nat
GetHours() == return time div 60;

GetMinutes: () ==> nat
GetMinutes() == return time mod 60;

end Duration

 

class Test

instance variables
private duration: new Duration()

operations
OnlyWorksIfDurationLessThanOneHour: Duration ==> ()
OnlyWorksIfDurationLessThanOneHour(d) == ...
pre d.GetHours() = 0   

end Test

Now currently this isn't legal. We can currently make time public, then perform the calculation directly in the pre-condition, or we could follow (one of) the proposals that allows calls to objects in pre-conditions.

ldcouto commented 9 years ago

Well, I would say making time public kinda defeats the point of the Duration class. But at least the functionality is there for a user that really needs it so that's good.

Personally, I'm a bit of a fundamentalist on public instance variables -- they are evil. But even if we say that they are not such a big deal in VDMpp models, there's still a few issues. Namely, the client of the class has to write a lot of redundant code. For example, pre d.time mod 60 = 0.

It was also suggested that we allow instance variables to only be visible to pre/post conditions -- the concept has some relation to logic variables in other formalisms.

kgpierce commented 9 years ago

Yes, I agree public here is a bit silly, but technically doable. I'd just like to use it to explore questions.

The thing with this example is Duration is so simple that there isn't a clear argument for not using a record (which are always readable), although I suppose when going to Java or similar, everything becomes a class.

kgpierce commented 9 years ago

I just chatted to Cliff and will chat further on Thursday. His general view is that abstraction and refinement is the key to avoiding these issues (and to me the "what were they thinking!?" of VDM++ comes into play).

For example, let's say you want to work with a linked list made objects and pointers. This is "horribly messy" in Cliff's words, but the abstraction above that is a sequence. So you make a clean specification based on sequences, then reify (refine) down to a more concrete version that uses objects and pointers. If the refinement is valid, you don't have to worry about making pre-post assertions.

kgpierce commented 9 years ago

Anyway so I interrupted my train of thought here, but I think if the class is so simple as to just be for holding data, then you don't need the OO. If the class is more complicated and you need to call operations on the objects, then calling pre- and post- on the object might be enough.

ldcouto commented 9 years ago

Right, but if a (complex) object is used as an operation parameter, how can we assert over it?

On the matter of refinement, how does it approach encapsulation? Dumb example but say the above list must always be ordered. It it's a list of records, we can write an invariant that uses one of the fields of the the record. But once we move to a list of objects how does it go? Can the refinement obligations see "inside" any object?

nickbattle commented 9 years ago

I'm not sure whether Cliff has spotted something fundamental, or whether he's cleverly ducking the issue. Perhaps both :-)

I think it's a good point that if you correctly (sic) refine a working specification in SL into a more concrete PP specification, then all the constraints (pre/post/inv) that you defined for the abstract spec will still hold for the refinement, even if they are not explicitly included in the specification. But we don't routinely refine basic SL specs to PP - in fact I'm not aware of a refinement theory that would enable that (I suspect PP semantics are too weakly defined to produce one). What we're trying to do is to write PP specifications that are not subsequently refined, and which contain their own constraints. These constraints are then tested by ad-hoc animation or combinatorial testing.

I think most people do and will use PP that way, and this RM is pointing out that there are problems with that (suggesting a solution). We can't really recommend that people avoid these problems by writing a more fundamental specification and then refining it - even if we did have a theory for such refinements from SL into PP. We have to repair PP somehow, in my view.

kgpierce commented 9 years ago

It was only a 5 minute chat to set up another meeting. Cliff just emailed me: "Is “operation quotation” an official part of the language?"

I'll check what he means, but I think he'll have some insight into our problem as well. I'll know more tomorrow morning after we chat.

nickbattle commented 9 years ago

The only time I've heard "quotation" used like that is when we're talking about explicitly invoking the pre/post condition test of an operation - ie. calling pre_op or post_op (or inv_T). My understanding is that these are an official part of SL (they were covered in JF's thesis). But of course we're struggling with them in PP, where they're not defined. So the answer depends on which language he's talking about :-)

kgpierce commented 9 years ago

"It’s also covered in a well-known VDM book (http://homepages.cs.ncl.ac.uk/cliff.jones/publications/Jones1990.pdf)

My point is that one can write a purely logical check for the non-empty stack as:

pre-POP(s)"

nickbattle commented 9 years ago

OK. And depending on whether pop() is passed a stack or the stack is in the "state" definition, the "s" parameter here is either the stack value to process (as a pure value), or if the pop operation has no parameter, then "s" is the mk_Sigma state record value that contains a stack to be popped. So in one case we pass state to the pre_pop(s) function, and in the other we pass a pure stack value to it. But in both cases, the function is pure and does not access the real state (just the values passed to it).

So how does that work in VDM++?

ldcouto commented 9 years ago

Are you sure that's how it goes Nick? I thought pre_pop always needed to be passed the state, regardless of any parameters to the operation.

Regardless, in VDMpp, I think we should leverage encapsulation and do away with state passing altogether. When you invoke a pre-condition, you just pass the operation parameters to it. The state can be worked out from the object receiving the message.

nickbattle commented 9 years ago

​Well, strictly the pre_op and post_op only have Sigma parameters in SL if the model has a state definition. Since Cliff wasn't clear what the spec was when he mentioned calling "pre-POP(s)"​, we don't know whether "s" is an argument in a stateless model or the state passed to an operation with no parameters. But this is just nit-picking; I think Cliff's point was well made.

I understand the suggestion, to make pre/post_op operations in effect. I don't yet understand the ramifications, though we looked at that at the start of this RM. We need to define whether pre/post_op operations can block, can affect history counters, can take time (in RT), can write to instance/static data, can spawn or stop threads, can call other operations and presumably other things I've not thought of yet! I'm hoping that we can achieve a set of rules for them which means that a simulation with pre/post checks disabled is exactly the same as one with them enabled (this is the case with pure function pre/post_op). If we can't achieve that, I would be very worried.

ldcouto commented 9 years ago

I totally agree with the second point, with some caveats. At the moment, it's quite easy to write a spec that crashes with checks enabled and succeeds with them disabled. Those kinds of specs will remain possible.

I think we are mostly in agreement on the possible solutions so maybe we should re-summarize and then move on to choosing one.

nickbattle commented 9 years ago

If you can show me a spec that behaves differently with and without pre/postconditions or invariants enabled, I'd be interested to see it. This is a bug, surely? The intention is that the behaviour is identical.

ldcouto commented 9 years ago

Well, it's kind of cheating ^^. But if the condition is undefined, the tool will crash. Try any operation with a pre-condition of x/0 = 0 for example. The same obviously holds for any condition that is false.

But the actual approach we use to OO state is irrelevant to this kind of thing. I was simply trying to say that the new approach should keep things as they are now. Any specification that previously behaved the same, should continue to do so. It's just that a few of them don't.

--ldc

On 3 December 2014 at 21:12, Nick Battle notifications@github.com wrote:

If you can show me a spec that behaves differently with and without pre/postconditions or invariants enabled, I'd be interested to see it. This is a bug, surely? The intention is that the behaviour is identical.

— Reply to this email directly or view it on GitHub https://github.com/overturetool/language/issues/27#issuecomment-65480993 .

nickbattle commented 9 years ago

OK, I see what you mean, but that's as expected. I'm only saying that behaviour with and without constraints should be the same if the constraints are/would be met :-)

I think we need to try very hard to preserve this "transparency" in whatever solution we adopt. That means that pre/post_op operations (and operations they call, etc) can do nothing that affects the behaviour of the rest of the specification. That's easy for functions, but quite limiting for operations.

And even if we solve this for pre_ops, we still have to enable post_ops to read two versions of the state somehow.

kgpierce commented 9 years ago

Hi, I chatted to Cliff again. He's keen to help if he can, however the best next step for him would be for me to go back with an example that we can chew on. Ideally something with an object chain that can demonstrate if the quoting style of VDM-SL is sufficient in PP.

kgpierce commented 9 years ago

A summary of the above discussions and proposed solutions is available here: https://github.com/overturetool/language/wiki/Summary-of-RM-27-Discussion

peterwvj commented 9 years ago

Based on Sunday's core and LB NM it was my impression that most people are in favour of a solution based on proposal one or two. If we choose to pursue one of these solutions I personally prefer using a keyword for declaring the operation pure so the intention of the operation is explicitly stated. This will also simplify static analysis.

One thing that I am a bit worried about is that we would allow a function, which is static by definition, to invoke an operation, which may not be static. Semantically speaking this does not make much sense.

It may not be the biggest concern, but this would not be possible to code generate, because a language like Java does not allow this (for a good reason).

So eventually we may need to make pre and post conditions of operations, operations too. I know it's not very good wrt proof theoretic work, but I don't really see a way around this.

ldcouto commented 9 years ago

I am, and have always been, in favor of the "pure operation" solution. Also, like @peterwvj says, a keyword has the advantages of making it explicit and easier to analyse.

Regarding the static issue, the function definition of the condition has access to the state as arguments so it can call operations on that. Also, a condition already has access to instance variables which are not always static either. Anyway, there's more issues with conditions in VDM++ and we can get to them afterwards. But we need to take things a step at a time.

peterwvj commented 9 years ago

Yes that's true, Luis - thanks for pointing that out - instance variables are passed via a map to the function. Returning to the keyword thing. Other than restricting pure operations to only invoking functions and pure operations, we should also require an operation overriding a pure operation to also be declared pure to simplify the static check for purity. Otherwise, as we talked about today, a static might be impossible for a model with dynamic dispatching.

I agree that we should take small steps. Yielding a runtime error if a pure operation modifies the state might be sufficient as a first step. Eventually, though, I think it would be worthwhile reporting potential violations statically.

nickbattle commented 9 years ago

I'm glad that we're still worrying about the bigger pricture here. It looked like there might have been a mad rush to allow operation calls in pre/postconditions without worrying too much about the consequences.

I'm relatively happy with making small steps like this, as long as we note the remaining problems and as long as we think that the small step is in the right direction!

As I see it, if we allow operation calls (with all the rules and POs to constrain them), then two things remain. First, we have to decide whether this is done by crudely promoting the pre/post functions to be operations (which will "just work", but which is a big concern from a theoretical POV), or whether they remain functions and we pass the state to them (which is difficult for complex states). Secondly, we have to decide how this works for "old" state~ values - though if we can figure out how to pass state to a function, this can presumably follow the same pattern.

nickbattle commented 9 years ago

Incidentally, are we saying that a pre/postcondition can call "op()" as well as "iv.op()"? It's just that the former might complicate the "static function" issue - there is no self in a function.

peterwvj commented 9 years ago

Just a question: Are old names suppose to stay limited to identifiers only - at least they are in Overture. I mean currently you can do iv~ but not iv1.iv2~. So it seems like we only need to take old values reachable from self into account. Please correct me if I' m wrong.

nickbattle commented 9 years ago

According to the grammar, old values are only defined for simple identifiers, yes. That's how it is in VDM-SL, where state data can always be referred to by a simple identifier (it's the same for the names in an "ext" clause). If we take that literally in VDM++, it would mean that we can only refer to the old values of our own local instance variables (which can be addressed by name).

Whether that's a deliberate limitation by the VDM++ language designers, or something that they just didn't think about, I cannot say!

In your example, you could perhaps refer to iv1~.iv2. So although we may only need to worry about old names for the "roots" of old expressions, we still need to provide the state data for what those roots refer to, which may be an arbitrary tree of objects.

tomooda commented 9 years ago

Peter, I think it depends on if we need a post_op quote or not. If we do post_op quotes, we need to keep old values other than its own inst vars. (otherwise we can't evaluate a post_op operation/function of iv's operation in its client's post clause in a meaningful way.) If we don't need post_op quotes but just want to evaluate the post clause after each evaluation of the operation, I think the situation will be pretty simple as you pointed out.

ldcouto commented 9 years ago

I think the issue of condition quotes is separate (but related) to having operations inside POs. I think condition quoting in VDM++ is not super clear (the language manual actually only mentions SL in page 96).

So maybe, we remove the feature entirely from VDM++. It also gewts around all the issues with passing state.

nickbattle commented 9 years ago

VDMTools does not generate pre_op and post_op functions for operations, presumably for the reasons we're discovering. I attempted to provide something in VDMJ, originally passing self and self~ parameters, but it has always been very limited (shallow copies, for example). The "map" solution for post_op didn't improve things much either.

So yes, perhaps we should remove pre_op and post_op as part of the interim solution (allowing op calls).

tomooda commented 9 years ago

Nick, I think a kind of [persistent data structure](Persistent data structure) is what we need with regard to preserving old states.

Besides the design of cloning algorithm, we'll need to re-design the semantics of equality about old states. The current definition of object's equality says iv~ = iv if the iv is not modified as a reference, but we want them not replaceable. We need iv and iv~ somewhat different.

I think this is a interesting topic, but will be a long shot. So, I agree with removing pre/post quotation from this particular context.

peterwvj commented 9 years ago

I think it sounds like a reasonable solution/trade-off to make (getting rid of pre/post quotation from this context that is). So as far as I understand this solves the problem we have with regards to passing old state, right? Do you see other things that we need to carefully consider?

peterwvj commented 9 years ago

Below summarises the findings from today's meeting regarding RM #27. The proposed solution is heavily based on solution one from the RM #27 note (https://github.com/overturetool/language/wiki/Summary-of-RM-27-Discussion).

The goal is to implement the current proposal for RM #27 in the tool and improve it in later releases:

  1. New keyword and runtime errors
  2. Type checking
  3. POs
  4. etc....
nlmave commented 9 years ago

Note: related "bug report" see thread https://github.com/overturetool/overture/issues/454

nickbattle commented 9 years ago

I think the #454 bug report has expanded into a proposal that is really an extension of this RM. See https://github.com/overturetool/overture/issues/454#issuecomment-117139098.

This seems compatible with the proposal we have here, but extends it to cover direct instance variable access from functions (therefore including invariants and pre/postconditions).

One slight concern I have is that this implies all function calls are atomic (no thread swaps), otherwise the value of the instance variables could change during the function. I'm not sure whether this atomicity would have implications for "functional style" specs that are also multi-threaded.

nickbattle commented 9 years ago

I've implemented something to address the concerns in issue #454 (in the ncb/development branch).

nickbattle commented 9 years ago

A few observations while trying to implement this:

peterwvj commented 9 years ago

Wrt. 1) I think it is reasonable not to permit pure constructors since they are intended to initialise state.

Wrt 2) Although I think it's possible to allow overloads to vary in terms of purity (since they have different type qualifiers) I think it will be of very little value. Especially considering that pure operations will mostly be getters (with no input parameters) anyway. I prefer not to allow overloads to vary in terms of purity, but I have no strong opinion about it.

Wrt. 3) Yes, that's my understanding to. I agree that the original formulation is not exactly clear

ldcouto commented 9 years ago

1 - Yeah, pure constructors seems like a bad idea.

2- Are overloads treated as separate operations with regards to visibility and pre/post conditions? Whatever the answer, it should be the same for purity.

3- Are functions allowed to show up in history expressions?

On 14 July 2015 at 20:10, Peter Tran-Jørgensen notifications@github.com wrote:

Wrt. 1) I think it is reasonable not to permit pure constructors since they are intended to initialise state.

Wrt 2) Although I think it's possible to allow overloads to vary in terms of purity I think it will be of very little value. Especially considering that pure operations will mostly be getters (with no input parameters) anyway. I prefer not to allow overloads to vary in terms of purity, but I have no strong opinion about it.

Wrt. 3) Yes, that's my understanding to. I agree that the original formulation is not exactly clear

— Reply to this email directly or view it on GitHub https://github.com/overturetool/language/issues/27#issuecomment-121327021 .

nickbattle commented 9 years ago

Overloads are essentially separate operations, and the caller is bound using the types of the arguments passed (in addition to the name). There are type checks that overloads can be distinguishes by the types of their parameters (else the binding would be ambiguous). So I suppose one overload might be pure, while another isn't, and that isn't a contradiction.

Functions cannot be named in history counters, no. They must be explicit operations. And since overloads are not distinguishable by (say) #act(op), there is also a check that all of the "op" overloads have the same "static" setting. So I suppose we should have a rule that none of the overloads of a history operation are pure?

I'm close to having something which does all the pure type checking in VDMJ. I'll build/release that first so you can all play with it, then the runtime checks can come separately.

nlmave commented 9 years ago

Agree on the constructor and operator overloading but I fail to see why pure operations are not allowed in permission predicates or history expressions. You mean to imply that an invariant cannot be checked in case the pure operation is blocked? Is that what you mean? Otherwise can you give a concrete example?

nickbattle commented 9 years ago

We've said that pure operation calls cannot be blocked. So they cannot have a permission predicate that evaluates to false. We could make that a runtime error, but it seemed odd to me to permit the specified to write something which appears to describe a synchronization ("wait until...") but actually is a precondition ("fail unless..."). So it felt more like a static type checking thing - "don't try to synchronize pure operations; if you do, you're thinking along the wrong lines."

Regarding history expressions, like #fin(pureop), we know that the execution of a pure operation is not allowed to influence anything else, and so it cannot increment history counters for itself (which could otherwise unblock and hence influence the specification). So I think that implies that history expressions cannot refer to pure operations. I suppose an alternative would be that all history counters for these operations always return zero? But if they never change, it is again misleading to allow them in sync clauses?

pglvdm commented 9 years ago

I would agree with Nick that the pure operations shouldn't have side effects, also not on history counters and surely not have permission predicates since that would be blocking in the evaluation of a pre-condition for example.

ldcouto commented 9 years ago

Yeah, I think I'm with you guys. Pure operations should not be allowed in history expressions. It would just confuse the users. Do we disallow them entirely in permission predicates or can they show up on the RHS?

per op => pureop(x)
tomooda commented 9 years ago

I think pure ops should be callable in state guards by the same reason why we need pure ops in preconditions.

pglvdm commented 9 years ago

I agree.

nickbattle commented 9 years ago

Yes Luis, that sort of comes out in the wash. The permission predicate is a function context, so you can only do things from there that you can do in functions/invariants. But that now includes calling pure operations. If you try to call an impure operation from a per predicate however, you get an error:

per op => impop() > 0
...
Error 3300: Impure operation 'impop' cannot be called from a function in 'A' (test.vpp) at line 18:18
nickbattle commented 9 years ago

OK, I've uploaded a VDMJ jar which I think does all the right type checking. There are no runtime changes at present (atomicity and so on). Note, you must run this in VDM10 mode to see the new changes (-r vdm10). Please give it a try and let me know if anything is unexpected.

http://sourceforge.net/projects/vdmj/files/Jars/vdmj-3.0.1.jar/download

The following nonsense spec provokes every type checking error (shown below):

class A
instance variables
    public x:nat := 0;

values
    pure v = 123;

operations
    pure A: nat ==> A
    A(n) == x := n;

    pure public op: () ==> nat
    op() ==
    (
        x := impop();
        return x
    );

    public impop: () ==> nat
    impop() == return op()
    pre x > op()
    post x < op();

sync
    per op => #fin(op) > 0

end A

class B is subclass of A
operations
    public op: () ==> nat
    op() == return 0;

end B

class Test
functions
    public f: A -> nat
    f(a) == a.impop() + a.pre_impop(a);

end Test

Parsed 3 classes in 0.173 secs. No syntax errors
Error 3341: Overriding definition must also be pure in 'B' (test.vpp) at line 31:12
Error 3286: Constructor cannot be 'async', 'static' or 'pure' in 'A' (test.vpp) at line 9:10
Error 3338: Cannot update state in a pure operation in 'A' (test.vpp) at line 10:13
Error 3339: Cannot call impure operation 'impop' from a pure operation in 'A' (test.vpp) at line 15:14
Error 3338: Cannot update state in a pure operation in 'A' (test.vpp) at line 15:9
Error 3340: Pure operation cannot have permission predicate in 'A' (test.vpp) at line 25:9
Error 3342: Cannot use history counters for pure operations in 'A' (test.vpp) at line 25:20
Error 3300: Impure operation '(a.impop)' cannot be called from a function in 'Test' (test.vpp) at line 39:13
Error 3091: Unknown member pre_impop(A) of class A in 'Test' (test.vpp) at line 39:27
Type checked 3 classes in 0.014 secs. Found 9 type errors
Bye
nickbattle commented 9 years ago

Oops, that "pure" value definition should give the following error (I only tested this with SL!). I'll update the jar.

Error 2324: Pure only permitted for operations in 'A' (test.vpp) at line 6:5
Parsed 3 classes in 0.175 secs. Found 1 syntax error
Bye