overturetool / language

Overture Language Board issue tracking

"Pure" operations called in functions #27

Closed joey-coleman closed 8 years ago

joey-coleman commented 10 years ago

The following bug was originally reported on Sourceforge by ldcouto, 2014-07-28 06:14:14.783000:

1 Identification of the Originator

Luis Diogo Couto

2 Target of the request: defining the affected components of the language

Introduce "pure" operations and allow them to be called inside functions. Pure operations do not affect the state of any object.

3 Motivation for the request

When using an object as a function parameter, it is often necessary to query the internal state of the object. A typical example is when the object is used as an operation parameter and its internal state needs to be inspected in the pre condition.

The only way to do that at the moment is to make the relevant instance variables public, which breaks encapsulation.

4 Description of the request, including:

(a) description of the modification:

Allow pure operations to be called inside functions. A pure operation does not affect the state of its object nor of any other. It is referentially transparent.

Pure operations can only read instance variables and call functions and other pure operations.

A new keyword "pure" can be introduced to identify these operations.

Another option is to use the externals clause. An operation that contains ext rd x is pure. Currently such an operation can call any operations on x so the semantics of ext rd would be altered.
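The request can be pictured with a small Python model (not VDM++). This is only a sketch of the idea, and every name in it (`Stack`, `can_push`, `pre_push_onto`, `push_onto`) is hypothetical: a read-only query lets a precondition written as a plain function inspect an object parameter without the instance variables being made public.

```python
class Stack:
    """Hypothetical bounded stack whose contents stay private to the class."""

    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity

    def can_push(self):
        # Read-only query (the "pure operation" of the proposal):
        # it inspects state but never modifies it.
        return len(self._items) < self._capacity

    def push(self, value):
        # Ordinary state-changing operation.
        self._items.append(value)


def pre_push_onto(stack):
    """Precondition as a plain function taking an object parameter."""
    return stack.can_push()


def push_onto(stack, value):
    """Operation whose precondition depends on the parameter's hidden state."""
    if not pre_push_onto(stack):
        raise ValueError("precondition violated: stack is full")
    stack.push(value)
```

Python does not enforce that `can_push` is read-only; in the proposal that guarantee would come from the type checker.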

(b) benefits from the modification:

When using an object as a function parameter it would be possible to check (or get) data that depends on the internal state of the object. This could be done without exposing the internal state of the object, which is an important principle of OO design.

This would be particularly helpful if we want to use objects in pre/post conditions and invariants, which are signature features of VDM. Pure operations would be particularly suited to implement the checks that these conditions rely on.

(c) possible side effects:

If the new keyword is introduced, current models that use it as a name will no longer be correct. For what it's worth, none of the overture examples use the word pure as a name.

If the semantics of ext rd are altered, models that use it may no longer be correct. 11 examples use ext rd, though I have not checked the bodies of those operations.

5 Source code and technical documentation where applicable

Documentation: the new keyword or the updated semantics of ext rd need to be explained in the language manual.

Source code: the new keyword would require changes to the parser. Both versions would require changes to the type checker.

6 A test suite for validation of the modification in the executable models.

N/A

nickbattle commented 9 years ago

Ah, it's obvious! The type of the state map parameter should just be "map of (seq of char) to (union of the types of all the variables referenced by the body)". I may implement that as "?" internally, but the signature would be logical and (most importantly) legal VDM.

So should we use these maps for both "self" and "self~" parameters?

kgpierce commented 9 years ago

Okay, this sounds good. I think we should use the map for self~ as well, since I'm assuming that means we're taking a snapshot at the end of the operation and is similar to the "instant" execution of pure operations mentioned in earlier discussions.

nickbattle commented 9 years ago

OK, so the two pre/post signatures are as follows:

op: P1 * P2 ==> R
op(p1, p2) == ...

M = map (seq of char) to (union of state variable types accessed by op body);

pre_op: P1 * P2 * M +> bool
pre_op(p1, p2, self) == ...

post_op: P1 * P2 * R * M * M +> bool
post_op(p1, p2, RESULT, self~, self) == ...

If the operation has no parameters, the Pn entries are missing from the signatures. If the operation returns no result, the RESULT is missing from post_op. If the body accesses no state, the M parameters are missing(?), or are they [M] parameters - but then what is the range of M anyway?

I'll try to find out what VDM-SL does for operations that are declared in a module with no state.

[Edit: if there is no state, VDM-SL omits the "sigma" parameters from pre/post_op. This does not happen if the operation does not access state, but only if there is no state visible to the operation. In the case of VDM++, I think we should omit the self/self~ parameters if the body does not access any state.]
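As a rough illustration of the signatures above, here is a Python model (not VDM++) in which the pre/post functions receive the state as plain maps from variable names to values. The `Account` class, its `balance` variable, and the helper `_state_map` are assumptions made up for this sketch; they do not come from the thread.

```python
def pre_withdraw(amount, state):
    """pre_op(p1, self): a genuine function of its arguments only."""
    return 0 < amount <= state["balance"]


def post_withdraw(amount, result, old_state, state):
    """post_op(p1, RESULT, self~, self): compares the two state maps."""
    return (state["balance"] == old_state["balance"] - amount
            and result == state["balance"])


class Account:
    """Hypothetical class with a single private instance variable."""

    def __init__(self, balance):
        self._balance = balance

    def _state_map(self):
        # Copy of the variables that the pre/post bodies refer to,
        # keyed by variable name (the "M" map in the signatures).
        return {"balance": self._balance}

    def withdraw(self, amount):
        old = self._state_map()            # snapshot for self~
        if not pre_withdraw(amount, old):
            raise ValueError("precondition violated")
        self._balance -= amount
        result = self._balance
        if not post_withdraw(amount, result, old, self._state_map()):
            raise AssertionError("postcondition violated")
        return result
```

Because the functions only see copies of the data, calling them cannot affect the object, which is the property the state-passing approach is meant to preserve.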

tomooda commented 9 years ago

Ah, Shin, you debugged Nick's example at https://github.com/overturetool/language/issues/27#issuecomment-60204255 . I see the same error message, too, and the accessibility of iv2 looks as Nick explained. What does VDMTools say about the accessibility of iv2?

nickbattle commented 9 years ago

Yes, sorry about that bool/nat confusion. I was concentrating on the accessibility errors, to check which instance variables were visible in the pre/post depending on the ext clause. As I was experimenting with the spec, I introduced (and ignored) the other errors. Changing it to "pre iv2 > iv1" or similar gets rid of the problem.

kgpierce commented 9 years ago

The LB decided to summarise the new proposal to help others join in the discussion. The summary will appear here in a few days.

ldcouto commented 9 years ago

Hi everyone.

I'm very happy to see that there has been steady progress on this. However, I am a bit worried that it was decided relatively quickly to break encapsulation at pre- and post-condition level.

It will mean that whoever writes the assertions will need knowledge of any class used as a parameter in order to write those assertions. That seems counter to the point of using OO, particularly in multi-modeller scenarios.

I also think this solution does not address the problem for inheritance. If the super class defines the operation and different subclasses provide different implementations (with different states), how will we write a meaningful assertion at the super class level?

I'm sorry to be so negative but I think we should fully consider this before committing to something and I'm not convinced visibility in pre/posts is the right way to go. I do see the point about breaking referential transparency (sort of). So maybe pure operations should not be called in functions and we should come at it differently. Perhaps VDM++ assertions should be pure operations instead of functions?

tomooda commented 9 years ago

Luis,

> I also think this solution does not address the problem for inheritance. If the super class defines the operation and different subclasses provide different implementations (with different states), how will we write a meaningful assertion at the super class level?

I agree. I talked to the LB members, and I'll post an example that exhibits this inheritance problem after Nick posts a summary.

kgpierce commented 9 years ago

Hi Luis, welcome back. Please don't worry that things have been decided in your absence, we are still very much in the "initial consideration" phase. There are a number of issues with OO and they are hard to solve, so I expect lengthy discussions. I think it's great that we're having this one and I'm sure that we'll make good progress.

nickbattle commented 9 years ago

This is a summary of an alternative approach to giving pre/post functions access to object state. It is not without its problems, but it avoids the constraints that we would have to place on pure operations (atomic, non blocking, pre/post only etc).

The proposal is to adopt a similar approach to VDM-SL in its definition of pre_op and post_op functions. In VDM-SL the state (including the original old~ state in post_op) is passed to the functions as data - effectively a copy of the actual state data. This is simple in VDM-SL as the whole visible state is always contained in a named record type. The bodies of the functions are wrapped in an implicit layer which creates local definitions for each of the individual state variables, so that the body can simply refer to "var" rather than "param.var" (where param is the name of the state parameter passed to the function).

With VDM++, the state that is visible to an operation comprises the tree of objects reachable from "self" (if it exists), plus the visible static fields in the specification (if any). Therefore we cannot pass the entire state in a simple record value as we can with VDM-SL. So the proposal is to pass a map of names (strings) to values, where the names are at least the names referred to in the body of the function. The implicit layer for VDM++ would similarly convert these values to local definitions, so that the body of the function can refer to them as if they were being accessed directly. So the pre/post functions for an operation would be as follows:

op: P1 * P2 ==> R
op(p1, p2) == ...

M = map (seq of char) to (union of state variable types accessed by op body);

pre_op: P1 * P2 * M +> bool
pre_op(p1, p2, self) == ...

post_op: P1 * P2 * R * M * M +> bool
post_op(p1, p2, RESULT, self~, self) == ...

We would also have to change the syntax of "ext" clauses to allow fields like "obj1.obj2.var" to be referred to, since the grammar (based on SL) currently allows only simple identifiers. If an ext clause exists, the domain of the state maps passed would be limited to the variables named.

Unfortunately, this approach also has problems. Firstly, to be able to access an arbitrary state variable like "obj1.obj2.var", the normal public/private access rules would have to be suspended within pre/post functions. If we did not do this, the function could not reason about private data that has been read or modified. Secondly, in a polymorphic hierarchy an object reference of type "Thing" may actually refer to an implementation of type "SubThing" which does not use the same variables to implement its state. So allowing pre/post functions to access state directly seriously hinders good OO design practice and encapsulation.

The benefit of this approach is that it is in keeping with VDM-SL, and that it neither breaks the referential transparency of functions nor requires the semantic constraints (atomicity, non-blocking, limited usage etc) that pure operations do.

tomooda commented 9 years ago

I put three "supposed to be equivalent" specs in a gist because they are way too long to put here :-) https://gist.github.com/tomooda/7076f69cb98b04fe8cbd The definition of the UsePool class in each file (pre_ops, pure ops, and pure ops and pre_ops) is the key point. The UsePool class uses an abstract Pool class, subclassed by Stack and PriorityBuffer, each of which has its own internal structure. An instance variable of type Pool holds an object of either the Stack or the PriorityBuffer class. The problem is how to write the preconditions of operations with regard to that instance variable.
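The difficulty can be sketched in Python (this is not the content of the gist, which is not reproduced here). The class names follow the description above; the methods `put`, `is_empty` and `state_map`, and the variable names `items` and `heap`, are assumptions invented for illustration. A precondition written against one subclass's variables does not carry over to the other subclass, whereas one written against a read-only query does.

```python
import heapq
from abc import ABC, abstractmethod


class Pool(ABC):
    """Abstract pool; subclasses choose their own internal representation."""

    @abstractmethod
    def put(self, value): ...

    @abstractmethod
    def is_empty(self): ...

    @abstractmethod
    def state_map(self): ...


class Stack(Pool):
    def __init__(self):
        self._items = []

    def put(self, value):
        self._items.append(value)

    def is_empty(self):
        return not self._items

    def state_map(self):
        return {"items": list(self._items)}


class PriorityBuffer(Pool):
    def __init__(self):
        self._heap = []

    def put(self, value):
        heapq.heappush(self._heap, value)

    def is_empty(self):
        return not self._heap

    def state_map(self):
        return {"heap": list(self._heap)}


def pre_take_by_state(state):
    # Written against Stack's variable name; knows nothing of "heap".
    return len(state["items"]) > 0


def pre_take_by_query(pool):
    # Written against the abstract interface; works for any subclass.
    return not pool.is_empty()
```

With a `PriorityBuffer`, `pre_take_by_state` fails with a `KeyError` because the variable it expects does not exist, which is the super class assertion problem raised earlier in the thread.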

nickbattle commented 9 years ago

Thanks for these examples, Tomo.

With the pure operations approach, where does the state come from for the objects that are being called? In the case of "old" state in the post_op, when you write something like obj~.getPure(), how is that old object state instantiated? I can see how we can (in principle) directly call pure operations on objects in the "real" environment for the "current" state in pre_op or post_op, but for the old state in post_op we still have to have a way to instantiate old objects in order to call their pure operations, surely?

And in complex cases, that old state might involve creating an arbitrary tree of old objects (eg. if getPure() calls further pure operations on other objects).

tomooda commented 9 years ago

obj~.getPure() and the like are evaluated right before the operation body, in arbitrary order. Because none of them can cause a side effect, the order of evaluation within the old state doesn't matter.
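One way to read this evaluation rule, sketched in Python under stated assumptions: the results of the old-state queries are captured in a map before the body runs, and the postcondition later looks them up by the phrase that produced them. `Counter` and `call_with_old_values` are hypothetical names, and this is a model of the idea rather than how any VDM tool implements it.

```python
class Counter:
    """Hypothetical object with one read-only query and one operation."""

    def __init__(self):
        self._n = 0

    def get(self):
        # Read-only query: returns state without changing it.
        return self._n

    def increment(self):
        self._n += 1
        return self._n


def call_with_old_values(obj, old_queries, body, post):
    """Run body(obj), checking post against values captured beforehand."""
    # Evaluate every old-state query before the body runs. Because the
    # queries are read-only, the iteration order cannot change the results.
    old = {phrase: query(obj) for phrase, query in old_queries.items()}
    result = body(obj)
    if not post(old, obj, result):
        raise AssertionError("postcondition violated")
    return result


counter = Counter()
call_with_old_values(
    counter,
    {"counter~.get()": Counter.get},
    Counter.increment,
    lambda old, obj, result: obj.get() == old["counter~.get()"] + 1,
)
```

Capturing values rather than objects means no "old" object ever has to be instantiated; only the query results survive into the postcondition.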

kgpierce commented 9 years ago

Just a note that Anne mentioned in her experience "pure" means no read or write and "read-only" is used where there are only reads and no writes. John also mentioned further up that he didn't feel "pure" was the right word.

kgpierce commented 9 years ago

Do we have to keep pre- and post-conditions as functions? Could we make it so that pre_op and post_op are generated as (read-only) operations?

tomooda commented 9 years ago

As for wording, I do NOT think the side-effect-free operations that we are discussing here are "pure" in general functional programming terminology, because they are not referentially transparent. On the other hand, JML calls methods that just read variables "pure methods". I think we can discuss wording after settling on a concrete spec for the new feature.

tomooda commented 9 years ago

Nick, how about an M type value that can hold "myObj~.getPure()" |-> along with inst vars?

nickbattle commented 9 years ago

Interesting idea Tomo. I had only envisaged the domain of the map being the name of an instance variable, but I suppose in general it could be any "phrase" that parses to something that is referred to in the body. That would avoid the problem that we have otherwise of constructing an arbitrary tree of old objects (these objects may have exotic constructors, that in principle could themselves affect state outside their own object!).

Ken, I'm not sure why changing pre/post_op functions into operations is a bad idea, but it sounds like we would be moving even further away from the principles of VDM-SL. Doesn't this undermine the (future) hope of having proof rules for VDM++? Besides, wouldn't we still need to figure out how to pass the old state to the operation, as we do currently?

kgpierce commented 9 years ago

I was just thinking out loud. It's one way to avoid breaking referential transparency for functions (by having operations generated instead of functions for pre-post), but yes it might move us too far from the original dialect. However we can prove things about operations in VDM-SL.

Re: naming, I'm happy to discuss later of course, I just wanted to mention it here since Anne's message was to the LB list.

kgpierce commented 9 years ago

I'll talk to Cliff Jones on Tuesday to see if he has some insight. He did some work with OO languages a while ago and may be able to offer insight here.

nickbattle commented 9 years ago

Good idea, Ken. I asked John Fitzgerald too - his PhD thesis was about modular SL specifications, and I asked how/whether an equivalent issue in SL was tackled, since with modules you can import an operation and call it, but then you can't see its state in the pre/post_op functions. He kindly sent me a copy of the thesis, which I have yet to digest :)

kgpierce commented 9 years ago

Hehe, you walked into that one. If I catch John over coffee I will see if he has a more succinct reaction.

JohnFitzgerald commented 9 years ago

He would have to read his thesis as well, in order to remember how he handled it back then:) Will discuss with Ken when I get over the jet lag (just landed back in UK).

J

kgpierce commented 9 years ago

Hi, I didn't manage to chat to John or Cliff properly yet. I'll be away now until 27th November. Perhaps Nick you could see what can be done to keep us moving here. Any luck reading John's thesis?

nickbattle commented 9 years ago

OK, I'll see whether I can get some input from them. I haven't literally read John's thesis yet (but I imagine he hasn't either!).

ldcouto commented 9 years ago

Thinking more about this, I think I'm becoming more in favor of just using operations for pre/post-conditions and invariants. It just seems like objects and functions interact poorly.

I know it sends us further away from VDM-SL, but is that necessarily a bad thing? There's also the matter of VDM classic and how close we want to stay to that.

On a related note, I've been discussing this (and related issues) with @peterwvj and one thing we keep going back to is the origin of VDMPP. Is there any kind of document that describes the creation of VDMPP, particularly some of the design rationales? Might be a useful resource.

nickbattle commented 9 years ago

Not sure I agree about pre/post_op as operations, but I'll try to get input from John/Cliff about that.

It's a very good point about the origins of VDM++, or "What were they thinking!!?" as it seems to me :) There are a few papers around that discuss this - I have seen a few via PGL. But as I recall, most of them are fairly shallow, along the lines of "objects are cool" rather than "we have some fundamental difficulties with the design of a consistent object based formalism". Perhaps I'm being a little unfair, let's ask for the papers, but that is my recollection.

ldcouto commented 9 years ago

Fair enough, Nick. Any particular reason you're against it or is it just the widening gap between PP and SL?

I will say that this can be done (sort of) with VDM-classic. You just wrap your assertions in a bool-returning operation and call that in the pre/post/inv entries.

nickbattle commented 9 years ago

I think it is the widening gap - or really that I think the way SL approaches it is for very good reasons, which allow SL specifications to be analysed formally, and that by changing this without a full appreciation of the ramifications for the proof theory (that we don't have) we might be forever damning VDM++ specifications to be "unprovable".

ardbeg1958 commented 9 years ago

Hi all

After reading this loooong discussion, an idea came across my mind. (I’m afraid that it seems too ‘pragmatic’ :-) Since an operation can already access state vars, why not in pre/post also?

In the language manual, it is now stated like:

explicit operation definition = identifier, ‘:’, operation type, identifier, parameters, ‘==’, operation body, [ ‘pre’, expression ], [ ‘post’, expression ]

How about we change this definition as follows?

explicit operation definition = identifier, ‘:’, operation type, identifier, parameters, ‘==’, operation body, [ predef ], [ postdef ]

predef = ‘pre’, expression | ‘pre_state’, statement

postdef = ‘post’, expression | ‘post_state’, statement

In pre/post expressions, you are still not able to access operations and object references, but you may access them in pre_state/post_state statements.

In the stack example, we can now write:

operations
public Push1 : int ==> ()
Push1 (i) == myStack.Push(i)
pre_state myStack.CanPush(); -- You can access object references and operations in pre_state

Maybe we can invent a smarter way if we have more precise semantics for PP part (like SL part :-) . But in the mean time, we can still keep roles of functions and operations separated.

annehax commented 9 years ago

It also occurred to me, like ldcouto, that it would seem natural to allow operations in pre/post conditions. However, these must be read-only, which requires the type checker to check that there are no write accesses. Without knowing your tools in detail, I suspect this may imply a major change. A note on my experience from RSL: there we allow pre and post conditions to be read-only expressions, which in particular means that we allow what you call operations to be applied. In RSL this is easy, because we do not distinguish syntactically between functions and operations; we have a notion of functions that may or may not access variables. However, allowing read-only expressions means that the type checker must check that there are no write accesses.

The description just above also seems like a solution.
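To give a feel for the kind of check involved, here is a deliberately shallow Python sketch that scans the source of a method for direct assignments to attributes of `self`. It is an assumption-laden illustration, not how the Overture or RSL type checkers work: the function name `writes_to_self` and the two sample sources are made up, and the scan misses writes made through method calls (such as `self.items.append(v)`) or through calls to other operations.

```python
import ast


def writes_to_self(source):
    """Return the names of self attributes that the source assigns to."""
    written = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, (ast.AugAssign, ast.AnnAssign)):
            targets = [node.target]
        else:
            continue
        for target in targets:
            # Only direct writes of the form "self.name = ..." are detected.
            if (isinstance(target, ast.Attribute)
                    and isinstance(target.value, ast.Name)
                    and target.value.id == "self"):
                written.append(target.attr)
    return written


# Hypothetical method sources to classify.
READ_ONLY = "def can_push(self):\n    return len(self.items) < self.limit\n"
WRITER = "def push(self, v):\n    self.count += 1\n    self.items = self.items + [v]\n"

print(writes_to_self(READ_ONLY))
print(sorted(writes_to_self(WRITER)))
```

A real checker would also have to follow calls transitively, so that a read-only operation can only call functions and other read-only operations.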

kgpierce commented 9 years ago

(Sorry for the delay in writing the summary, it's quite tricky!)

I wondered if another option is to allow pre-conditions to be called in pre-conditions of other objects...

class Stack

instance variables
s: seq of int :=[];

operations

public Pop: () ==> int
Pop() == let h = hd s in (s := tl s; return h)
pre s <> [];

end Stack

Then the pre-condition of Pop (which can see local state) could be referenced from the other class...

class UseStack

instance variables 
myStack: Stack := new Stack();

operations
public PopMyStack: () ==> int
PopMyStack() == myStack.Pop()
pre myStack.pre_Pop();

end UseStack

ldcouto commented 9 years ago

At the moment, the quoted pre-condition of Pop (pre_Pop) is a function and every relevant instance variable is passed to it so it cannot really "see" state. Nick discussed this a bit starting here: https://github.com/overturetool/language/issues/27#issuecomment-59647707

From your suggestion Ken, are you also in favor of making pre-conditions read-only operations then?

kgpierce commented 9 years ago

Since I'm trying to work out a summary of the various options, I'm just wondering what the various options are in the sense of a) do they (sufficiently) solve the problem and b) what are the consequences. So my suggestion above is to allow pre-condition functions to call other pre-condition functions (unless they can do this already?). Another alternative a few of us have mentioned is to "upgrade" pre-conditions to become read-only operations. I don't know what I'm in favour of yet.

By the way, are pre-conditions referentially transparent at the moment despite this state-passing?

nickbattle commented 9 years ago

You can call pre/postcondition functions from another location like this currently in VDMJ, but (since they are functions!) you have to pass the state to them as arguments, rather than depending on the object reference via which you call them. They can't really "see state" in that sense; they are pure functions. Postconditions need both before/after state passed.

Even if we solve how to pass state, I'm not sure that a precondition is always sufficient - it may work in this example, but you could make up a different case where you would need to call a getter operation or access instance variables directly.
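The distinction can be modelled in Python as follows. This is a sketch, not VDMJ behaviour: `pre_pop` stands in for pre_Pop as a function that receives the state explicitly, and `snapshot` is a hypothetical helper that produces the state to pass. The existence of `snapshot` is itself the open question, since something has to hand the caller a copy of the private state.

```python
def pre_pop(state):
    """pre_Pop as a genuine function: it sees only the state passed in."""
    return len(state["s"]) > 0


class Stack:
    def __init__(self, items=()):
        self._s = list(items)

    def snapshot(self):
        # Hypothetical helper: an immutable copy of the instance variables.
        return {"s": tuple(self._s)}

    def pop(self):
        if not pre_pop(self.snapshot()):
            raise ValueError("pre_Pop violated")
        head, self._s = self._s[0], self._s[1:]   # hd s, then s := tl s
        return head


class UseStack:
    def __init__(self, items=()):
        self._my_stack = Stack(items)

    def pre_pop_my_stack(self):
        # "Quoting" the precondition: the caller supplies the state.
        return pre_pop(self._my_stack.snapshot())

    def pop_my_stack(self):
        if not self.pre_pop_my_stack():
            raise ValueError("pre_PopMyStack violated")
        return self._my_stack.pop()
```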

nickbattle commented 9 years ago

Yes Ken, pre/postcondition functions are genuine functions. They have a layer of "trickery" to populate the environment with variables with the same names as the state before the body is called, but they are absolutely genuine functions.

ldcouto commented 9 years ago

I was just about to say the same thing but Nick beat me to the punch so instead I'll try to summarize:

I think, broadly, we have discussed two avenues:

Any other possibility? I know we've discussed some way to "properly" pass state to pre-conditions/etc. but it seems like that idea is a bit behind the others.

--ldc

nickbattle commented 9 years ago

Well, I would say that the "pass state properly" approach is suffering in the race because it is difficult, but I would be reluctant to throw it out unless the alternatives avoid undermining the integrity of the language - breaking referential transparency, allowing pre/post/inv checks to affect behaviour, blowing away any chance of a tractable proof theory etc.

kgpierce commented 9 years ago

Nick, thanks for the input. It might be good if we can come up with a "different case where you would need to call a getter operation or access instance variables directly".

Luis, we also mentioned allowing pre-/post-conditions to read state of objects, but this doesn't really work for maintaining OO encapsulation or when subclassing is involved.

It seems either way we need some form of read-only operations? Either "pre-conditions and friends" have to be read-only, or "restricted form of operations" are read-only.

kgpierce commented 9 years ago

Okay, so the "pass state properly" approach is my "allowing pre-/post-conditions to read state of objects"?

ldcouto commented 9 years ago

I suppose your "allowing pre-/post-conditions to read state of objects" is a mix of "pass state properly" and "pre-conditions pierce encapsulation".

The first part is an interesting challenge. The second, I think, is potentially problematic (as you said, it gets messy with subclassing).

nickbattle commented 9 years ago

To me, passing state (as an argument to a function) is very different to having something (an operation, by definition) which accesses state directly.

I'll try to cook up an example where a precondition quote is insufficient, but surely, in general a pre/post of an operation that calls other objects' operations needs to be able to check more than those operations' preconditions? If operations increment a value and return it, and their preconditions check that the value is >0, the logic of the calling postcondition may still need direct access to the variable value; the precondition >0 test is not sufficient.

kgpierce commented 9 years ago

So we could allow quoting of called post-conditions as well perhaps? If the calling operation isn't directly manipulating the state of the called object, it can't guarantee more than the called object's operations anyway.

ldcouto commented 9 years ago

I gotta say I still haven't fully grasped this quoting approach.

If we call myStack.pre_Pop(), what exactly would be the arguments of the precondition? Is it the same as the Pop() operation, i.e. none? Or do we have to pass an object reference to it somehow?

kgpierce commented 9 years ago

class Countdown

instance variables
clock: int := 10;

operations

public Reduce: () ==> int
Reduce() == (
    clock := clock - 1;
    return clock
)
pre clock > 0
post clock = clock~ - 1;

end Countdown

class Bomb

instance variables
cd: Countdown := new Countdown();

operations
public Tick: () ==> [<Boom>]
Tick() ==
    let count = cd.Reduce() in
        return if count = 0 then <Boom> else nil
pre cd.pre_Reduce()
post cd.pre_Reduce() or RESULT = <Boom>;

end Bomb

I was going to call cd.post_Reduce() in the post-condition but it's not needed in my silly example.

nickbattle commented 9 years ago

OK Ken, but if your example didn't just reduce the time, but called (say) TimeRemaining, which returned the private variable, and then did something depending on that, you would need to call TimeRemaining in the postcondition - or access the private variable. The precondition would not be sufficient, even though it's a perfectly good precondition.

nickbattle commented 9 years ago

Luis, you can see the form of the pre and postcondition functions in an earlier comment: https://github.com/overturetool/language/issues/27#issuecomment-60588517

ldcouto commented 9 years ago

Nah, I have to disagree there Nick. With pre/post conditions, an external should be able to simply rely on them and be blissfully ignorant of any other details. Therefore, if the post-condition of Clock is insufficient, it's on the modeler to fix it.

Thanks for the clarification on how the functions look. I get it now (seems challenging to implement).

Also, and not to pile on any further but... how does this approach work with objects as parameters? In essence, the precondition of the operation depends on some attribute of the parameter.

kgpierce commented 9 years ago

Nick, Luis, both interesting points. I'll see if I can modify the example sensibly to create something more to bring this point out.