overturetool / overture

The Overture Tool
http://overturetool.org
GNU General Public License v3.0
48 stars 25 forks source link

Failed detection of abstract class possibly due to operation types #366

Closed ldcouto closed 8 years ago

ldcouto commented 10 years ago

This issue is related to type checker errors when attempting to instantiate abstract classes.

If a subclass does not implement any of the is subclass responsibility operations then the subclass is itself abstract and should not be instantiated. The type checker normally detects this and issues an error when you try to invoke the constructor.

However, if the subclass has an operation with the same name as the super class but different parameters, then no error is issued.

Here is a minimum example:

class A
operations
public op : int ==> int
op(x) == is subclass responsibility
end A

class B is subclass of A
operations
public op : int * int ==> int
op(x,-) == return x+1
end B 

class Test
instance variables
x : A := new B();
operations
public test : () ==> int
test() == x.op(1);
end Test

For reference, I tested this in test build 140: commit 09a6683

nickbattle commented 10 years ago

This is a nasty case, but I agree we should probably clean it up.

What's happening is that the override of op in B is hiding the (apparent) overloaded operation from A. C++ works this way for overloaded name binding (and much of VDMJ's OO-ness is more like C++ than Java), but I suspect you are right that B would still be considered abstract, even in C++ (I'll try when I get a minute). I'm also not sure what would happen in C++ when you attempt to access the B`op via an A reference. If you try via a B reference, the type checker would say that there was no such operation. Basically, C++ overloaded name resolution does not search the hierarchy. Strange, but true. VDMJ is trying to do something similar, and possibly being inconsistent about it. But then VDM++'s OO semantics are very poorly defined.

The following stackoverflow question gives some C++ background: http://stackoverflow.com/questions/888235/overriding-a-bases-overloaded-function-in-c

ldcouto commented 10 years ago

Well, certainly we can define what the behavior should be.

I don't know if the current behavior should be deemed valid. Personally, I lean more towards Java than C++ OO so I might say no.

nickbattle commented 10 years ago

We should absolutely define the behaviour, yes. And in the case of abstract classes, I suspect even C++ would regard the subclass as abstract, though I haven't tried it yet.

In the few places where C++ and Java OO semantics differ, I think that C++ usually seems more sensible, in the sense of keeping behaviour of classes well separated. So for example, if an overridden method is called by a constructor, Java calls the overridden method (even though the subclass has not yet started its construction), whereas C++ calls the original method, keeping that class' construction independent of whether it not it appears in a class hierarchy.

That sort of sensible separation means that we stand a chance of using VDM++ to define classes with provably consistent behaviour that doesn't need re-proving when the class is composed in a hierarchy. So I think we should keep this sort of thing in mind when we define VDM++ OO semantics!

ldcouto commented 10 years ago

I've been thinking about this and should we consider B.op to be an override of A.op, seeing as how they have different parameters?

I guess the broader question is "are parameters (or their types) part of an operation's name?"

peterwvj commented 10 years ago

I agree. Parameters are part of the signature of the operation (not the name) and I think ops/funcs should be compared based on the signature rather than the name. Since we allow op/func overloading that's kind of what we do already.

nickbattle commented 10 years ago

​So we're debating whether to bind VDM++ names like Java (where overloaded signatures are separate, I think) or C++ (where overloads are "local").​ I think I mentioned before, C++ often does things for very good reasons - the sorts of reasons that would make an OO language much easier to analyse formally. So while I share the intuitive view that Java's way of doing things "makes sense", I would like to understand why the C++ committees decided as they did before making VDM++ the same [as java]. There is bound to be a reason, and we need to make sure it isn't one that we care about [if we make it like Java].

[Edited for clarity]

ldcouto commented 10 years ago

Solid reasoning, Nick. As a C++ neophyte, I'm not sure what is the best place to look for these kinds of design decisions. Can you suggest a reference?

I will say, as a preliminary thought, that Java's absence of multiple inheritance "simplifies" a lot of issues so that may allow them to make decisions that lead to intuitive interpretations.

I can see a case for either approach. I just want it to be consistent and documented, whatever it ends up being.

nickbattle commented 10 years ago

I'm rather rough on C++ myself, I'm afraid. We will have to resort to Mr. Google. There were some clues in that Stackoverflow URL I added earlier - I think the issue is along the lines of (not) affecting a class' behaviour when it is composed in a hierarchy. If you have complex binding rules (which C++ does) which allow op(arg-type) to bind to definitions that are not identical to that type (but rather, can be coerced to that type or constructed by using the argument passed), then there is a danger that a method which calls that would "magically" call something else when called in a subclass that provides a "more attractive" overloaded binding. That would mean that, although the superclass was well tested and perfectly thought through, in the hierarchy it could fail.

I'm guessing a bit (can you tell?!) but I think that's the essence of the problem. Constraining overload bindings to one level in the hierarchy would protect the superclass in this case. We have an analogous problem in VDM++: if we have created a class and analysed it to death - possibly by formal proof - we want that assurance to stand absolutely firm when we compose hierarchies with the class. It's the same with superclass constructors (not) calling overridden methods. (This is also the argument for having Strong Behavioural Subtyping of overrides, incidentally).

​I think it's fair to say that a lot of Java's design decisions were ​ ​definitely trying to simplify the perceived complexity of C++. ​ ​But I think it's also true that several of their simplifying changes are not as "clean" as C++ strives to be. And VDM++ has to be squeaky clean! :-)​

nickbattle commented 10 years ago

I've just tried the equivalent of the original example in C++ - ie. an "overloaded" function in a subclass of a class with a pure virtual (abstract) function. The compiler says that the subclass cannot be instantiated because it is abstract.

This makes sense to me. The class is abstract, and so (in our case) this is a bug. The related issue of "how should overloading and overriding work!!?" is a distraction (albeit important too).

ldcouto commented 10 years ago

I certainly agree that Java is not as clean as it claims to be. Particularly, since it's in version 8 and all that. But one thing I find Java has going over C++ is that it tends to be less confusing from a fresh user perspective.

I agree that VDM needs to be well-defined or the value of any resulting analysis is low. But it also needs to be clearly understood by its users. Modelling should abstract stuff away and I think there is not much value in "struggling" to get a complex inheritance and override hierarchy working. Especially since the implementation language may change said hierarchy.

So I guess I agree that we need the type checker to detect this (it seems statically decidable) and we need to separately consider a clean definition of overloads and overrides. I think @peterwvj actually suggested not having overloads ^^

peterwvj commented 10 years ago

With respect to overloading we could avoid having the user "struggling" with understanding complex rules for inheritance by not allowing overloads. We have already seen that it causes a lot of complexity during semantic analysis and as far as I recall Nick pointed out a few more in his OO issues paper.

And even without inheritance you can do things that are quite confusing. In Java you can end up in situations where the type checker cannot decide what operation is being invoked (ambiguous call). Furthermore, if you have op(double d) and op(int d) declared then op((double)1) and op(1) are invocations of different methods.

Honestly, I'm not sure that the pros of overloading outweighs the cons.

ldcouto commented 8 years ago

Arise, ancient bug report!

There is still no real progress on this. I think we need to come to a decision and address this one way or another.

This is a very annoying bug. I'm currently in the process of refactoring a class hierarchy and cannot rely on the type checker at all to help me identify which subclasses still need to be addressed.

Has any further thought been given to banning overloads? I was talking to @lausdahl the other day and there some issues in C code generation that would also be helped by that.

nickbattle commented 8 years ago

I'm not sure what you mean by "banning overloads". VDMJ takes the C++ position that inherited names should not "mix" with locally defined names in the overloading mixture; rather a local name hides all the (otherwise) inherited names. There was a bug (now fixed), where the type checker enforced this, but the runtime environment did not, which could lead to problems with union types (where the TC could not know which type would be passed at runtime, and therefore could not decide the static binding).

I'll look at it again, while it's relatively fresh in my mind from the bug above. But perhaps you can see the problem: it is hiding the abstract inherited name because of the local definition, but that then also hides the fact that the inherited definition needs to be overridden to avoid leaving an abstract class. So it looks like the hiding needs to be a bit more intelligent!

ldcouto commented 8 years ago

Now I'm not sure what you mean. If a local name hides an inherited name, is that not them mixing?

Banning overloads means you cannot overload an operation, inherited or otherwise.

nickbattle commented 8 years ago

We want overloads (two or more ops/fns of the same name, with distinguishable parameters). We obviously must have overrides (subclass specializing the same name/params from a parent) and inheritance. But it's not clear what to do if you inherit a name that "overloads with" a local name. In C++, the inherited name is hidden in this case, so there is only the local name (and any local overloads) visible. I think Java mixes inherited names with local overloaded names.

Anyway... I just had a look at this and made an experimental tweak. VDMJ now does the following for your test spec:

Parsed 3 classes in 0.172 secs. No syntax errors
Error 3330: Cannot instantiate abstract class B in 'Test' (test.vpp) at line 15:10
Unimplemented: op(int ==> int)
Type checked 3 classes in 0.011 secs. Found 1 type error

But unfortunately a dozen or more of the CSK test suite tests are giving unusual errors, so I need to work through these and figure out whether they are right or not.

ldcouto commented 8 years ago

Why do we want overloads?

nickbattle commented 8 years ago

Well, it's a fair question. I suppose it is "natural" to have several operations that do the same thing, but do it with different parameters, and you would "naturally" want to give them the same name and let the compiler work out the binding. Otherwise you'd have to call them procInt, procChar, procObject etc.

But perhaps that wouldn't be so bad. They're not super essential, but they're very common in languages (OO and otherwise).

ldcouto commented 8 years ago

Yeah, I know the question was kind of obtuse.

But I was just trying to make the point that there is a cost to be paid for having overloads and, like you said, there is a relatively simple workaround for the users. So maybe we should not just say that we must have them.

In SL, overloads yield a "duplicate definition" warning. And if you try to run the ops, they both give an error. That's actually pretty great :)

> p op(1)
Error 4087: Cannot convert 1 (nat1) to char in 'DEFAULT' (LaunchConfigurationExpression) at line 1:1
Runtime: Error 4087: Cannot convert 1 (nat1) to char in 'DEFAULT' (LaunchConfigurationExpression) at line 1:1
> p op('a')
Error 3064: Inappropriate type for argument 1 in 'DEFAULT' (LaunchConfigurationExpression) at line 1:1
Expect: int
Actual: char
nickbattle commented 8 years ago

It's true that there is no overloading in SL. In fact, I'm surprised it's only a warning. You could make the case for removing overloading from VDM++/RT (on the grounds of simplicity) and put it to the LB as an RM. But one clear objection would be that it is removing a fairly fundamental part of the language that many existing models depend on. (A better RM would be to define the rules formally! But perhaps we will get that with the UTP model?)

In the meantime, this issue is clearly a bug and I ought to fix it :)

peterwvj commented 8 years ago

I think it would be much better if we change duplicate definitions in SL to produce a type error instead of a warning. Is there a good reason why it is not like this? If you agree I can set up an issue for this.

ldcouto commented 8 years ago

What do the semantics say?

nickbattle commented 8 years ago

(I'm not sure what the official SL specification says. What does VDMTools do? It's quite good with SL).

I think I have a fix for this bug in VDMJ. I should port to Overture fairly easily. Maybe over the weekend.

peterwvj commented 8 years ago

If you TC the spec below using VDMTools you get an error and no warnings:

Error List: 1Error, 0 Warnings
...\Dup.vdmsl, l. 8, c. 13:
  Operation already defined
operations

op: () ==> ()
op() == skip;

op: () ==> ()
op() == skip;

I thinks this is sensible.

ldcouto commented 8 years ago

I think we should do the same. At least for SL.

peterwvj commented 8 years ago

Yes, I agree. If Nick agrees too then I'll set up an issue suggesting that we make Overture do the same.

I should mention that if you do a similar thing for valuesyou only get a warning:

...\Dup.vdmsl, l. 5, c. 1:
  Warning[100] : "x" is multiple defined, and must have the same value
values

x = 1;
x = 2;

However, if you do this for type definitions then you get an error

...\Dup.vdmsl, l. 5, c. 9:
  Type already defined
types

N = nat;
N = char;
nickbattle commented 8 years ago

OK, by all means raise a separate issue about the SL errors - though it seems the requirement may be more subtle than we knew :)

I'm about to push a fix for this bug.

nickbattle commented 8 years ago

Hmmm... looks like there is still a problem with this. So I'll remove the mergable flag!

nickbattle commented 8 years ago

Right! I now think this is fixed, but it's been another "experience" of the subtle differences between VDMJ and Overture, AST differences and LexNameToken.equals threw me for a while... But now the original spec on this issue works correctly, as well as there being some updates to the test suite results.

I'll put the mergable flag back in place :)

ldcouto commented 8 years ago

Works correctly means type error right?

nickbattle commented 8 years ago

LOL, yes, as entered above the class B is abstract, so it cannot be instantiated in the Test class and this now gives a TC error. If you change the signature in B to override the abstract operation from A, then it is fine.