adamsol / Pyxell

Multi-paradigm programming language compiled to C++, written in Python.
MIT License

Super #3

Closed skaller closed 3 years ago

skaller commented 3 years ago

"Inside the method body, you can call the corresponding method of the parent class using super keyword."

What happens if your base and derived classes both have methods f and g, and in the definition of f in the derived class you want to call the base class method g? As far as I can see, there is no way to do this. In fact, I would suggest outlawing calls to overridden base class methods altogether. The only case where you might need this is in constructors.

adamsol commented 3 years ago

Yes, there is no way to do this, since I've never found such behaviour useful and for me it only complicates the code. However, calling the same method of the base class is in my opinion very useful (I use it often at work, in Python and in JS). As for constructors, super is already not allowed there, since base class constructors are called automatically.

skaller commented 3 years ago

I take your point on constructors; automatically calling the base constructors is reasonable. However, the correct solution to the problem I posed also applies to calling the method you're overriding, which is extremely bad practice and theoretically unsound. You may use that method frequently, but Python and JS are both junk languages, so it hardly counts as evidence.

The correct technique is to disallow calling overridable base methods. If you want to do this, the base method should have been factored into a non-overridable helper method which the base method calls, and which its override in the derived class can also call.
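The factoring described above can be sketched in Python. The names `Greeter`, `LoudGreeter`, `greet` and `_base_greeting` are hypothetical, and Python marks the helper as "not to be overridden" only by convention (the leading underscore), so this illustrates the shape of the idea rather than an enforced design:

```python
class Greeter:
    def _base_greeting(self, name):
        # Helper holding the logic that would otherwise live in greet().
        # By convention, subclasses may call it but do not override it.
        return f"Hello, {name}"

    def greet(self, name):
        # Overridable method: it only delegates to the helper.
        return self._base_greeting(name)


class LoudGreeter(Greeter):
    def greet(self, name):
        # The override calls the helper directly instead of super().greet(name).
        return self._base_greeting(name).upper() + "!"
```

With this layout the override never needs `super`: both the base method and its override reach the shared code through the helper.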

In C++, the general advice on virtual functions being public is utterly and completely wrong. In fact, the correct access mode for C++ virtuals doesn't exist: they should be private in the base, so derived classes cannot call them, but even more, the overrides in a derived class should be totally invisible. Same solution as I mentioned before: if you have code you want to call in your derived class that implements the derived behaviour, define a helper and have your virtual call that, and then you can call it directly from the derived class only, statically, without any virtual dispatch. Virtuals are implementation details that depend on the derived class implementation details and should only be callable at the level where they're introduced, and never by the public.

Hmm. I think I just argued you should get rid of super altogether :-)

adamsol commented 3 years ago

That's quite a radical approach. I kinda like the idea of extending methods naturally with super. Of course, you can define helper methods, but I think that would be cumbersome to do in all cases, and would not be possible at all when extending an external class.

Can you give some specific arguments about what the problems with the current solution are?

skaller commented 3 years ago

It's not radical, it follows immediately from considerations of pre and post conditions and invariants. Objects have two kinds of invariants: those relating to the published, public semantics, and those relating to the internal, publicly invisible representation.

The public invariants are exclusively categorical if you have a completely private representation; in other words, they relate only to the interactions of the functions of the API. For example, with a stack this is an invariant, using reverse polish application syntax in the OO style:

push.pop = identity

In other words, if you push something onto the stack and then pop something off it, you're back to the original stack. The invariant is how the semantics are specified without mentioning any values, just using functions. That's category theory.
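As a rough illustration of checking that invariant, here is a small Python sketch using an immutable stack represented as a tuple. The function names `push`, `pop` and `check_push_pop_identity` are assumptions made for the example, and `pop` here returns only the remaining stack, not the removed value:

```python
def push(stack, value):
    # Return a new stack with value on top; the input tuple is not mutated.
    return stack + (value,)


def pop(stack):
    # Return the stack without its top element.
    if not stack:
        raise IndexError("pop from empty stack")
    return stack[:-1]


def check_push_pop_identity(stack, value):
    # The invariant from the text: push followed by pop is the identity.
    return pop(push(stack, value)) == stack
```

The check mentions only the API functions and an equality on stacks; it says nothing about how the stack is represented internally.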

You can also have representation invariants. For example, a representation of rational numbers typically has the form P, Q, where P is any integer and Q is a positive integer that shares no common factor with P other than 1. So the form 4,2 is not allowed because 2 divides both 4 and 2: P and Q have to be relatively prime.

Public constructors accept ANY input, and either establish the representation invariant, or, if they can't, they bug out the program with a rude error message. Subsequently, all API calls must preserve the public invariants.
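A minimal Python sketch of such a constructor for the rational number example follows. The class name `Rational` and the fields `p` and `q` are hypothetical; the constructor either normalizes its input so that the representation invariant holds (q positive, p and q relatively prime) or raises an error:

```python
from math import gcd


class Rational:
    def __init__(self, p, q):
        # Accept any input; reject what cannot be made to satisfy the invariant.
        if not isinstance(p, int) or not isinstance(q, int):
            raise TypeError("numerator and denominator must be integers")
        if q == 0:
            raise ZeroDivisionError("denominator must not be zero")
        # Establish the invariant: q > 0 and gcd(p, q) == 1.
        if q < 0:
            p, q = -p, -q
        g = gcd(p, q)
        self.p, self.q = p // g, q // g
```

Once the constructor has run, every other method of the class may assume the invariant on entry without re-checking it.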

On the other hand, public methods can call protected or private methods to do their work but they must never call public methods. Public methods are for the public only. The reason is that public methods must preserve both the public invariants and representation invariants, if necessary by aborting the program if it is not possible. But protected methods do not have to preserve public invariants because the public cannot call them. They must, however, preserve representation invariants, because an unknown method in a derived class is allowed to call them.

Private methods don't have to preserve anything. The reason is they can only be called by a fixed known set of methods which cannot be extended, so the programmer can do anything, provided the result of some sequence of private calls preserves the invariants the caller must.

Preserving invariants is more than just ensuring the state on exit obeys the rules. It is even more important to understand that the method can assume the invariant is established on entry, so no checking is required; in fact, any checking would be a waste of time and space because the checks are certain to pass. This is why private methods, for example, must not call public methods: the public method tries to establish an invariant which private methods do not require.

The rules are very strict for good design. Methods can only call methods of the same or weaker protection level, because all methods assume an invariant is established on entry, and must ensure that the invariant is established on exit.

Virtual methods must be private and they should not call each other. They can call private helper methods or directly fiddle the representation, but they do not have to worry about invariants, so long as the methods that call them establish the appropriate invariants (none for private, representation for protected, and categorical for public). In turn, this means their overrides, which fiddle representation details, must never be called at all. Only the base class is allowed to call them by virtual dispatch. This is because the base class establishes the semantics of the virtuals it introduces, and those semantics (invariants) could be destroyed in a derived class if derived class methods could call the locally defined override directly.

The point is that when a derived class method calls a virtual it does not actually know its own overriding method is called because it could be overridden again in a class that is derived from it! The point is that no one knows which function is called by a virtual dispatch!! The only allowed way to call a virtual is via a non-virtual in the base class which introduces it and delegates to the virtual, otherwise there is no way to ensure the rules about preserving invariants are obeyed.
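The rule in the last sentence (a non-virtual method in the introducing base class is the only caller of the virtual) can be sketched in Python as below. The names `Shape`, `Square`, `area` and `_compute_area` are hypothetical, and since Python has no access control, "non-virtual" and "private" are conventions here, not something the language enforces:

```python
class Shape:
    def area(self):
        # Non-virtual entry point (by convention, never overridden).
        # It is the only caller of the hook and checks the result,
        # so the invariant is enforced in exactly one place.
        result = self._compute_area()
        if result < 0:
            raise ValueError("area must be non-negative")
        return result

    def _compute_area(self):
        # The "virtual": overridden in subclasses, called only by area().
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def _compute_area(self):
        # Override supplies representation-specific work only.
        return self.side * self.side
```

Callers use `Square(3).area()`; no code outside `Shape.area` invokes `_compute_area`, so whichever override the dispatch selects, the base class check still runs.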

And that's why super is wrong, except in constructors, because that is the only time when you know the complete class, because you're currently extending a complete base into a complete derived type. (In fact this is only true with single inheritance, I think: because of virtual bases, you cannot actually tell where a virtual dispatch will go.) I also note this is why final was introduced in Java, and later in C++, because programmers do not understand the rules, and do not write correct code. final is a design fault which allows the programmer to know the actual implementation which is called. It is also why C++ design is good in the sense that only a fixed set of known methods are virtual, and others non-virtual. Languages which allow any method to be overridden are broken. It cannot work. Languages without protected access are also broken and cannot work. Bjarne was wrong in claiming protected was a mistake, totally wrong. In fact, the protection levels are not strong enough to ensure overridden virtuals cannot be called, and that is a requirement.

I know this is a complicated argument, but it follows from basic considerations of the invariants that methods of various protection levels maintain, where the protection levels are used to establish who can call what.

Everyone cheats. Even I cheat. Only the Felix garbage collector strictly follows these rules, because it is doing a reasonably complex job, and there are two versions: a single threaded one and a thread safe one: the abstraction is the same for both. The implementation of the thread safe version has to establish thread safety invariants that the non-thread safe one doesn't, but the algorithms involved are complex enough I wanted to reuse the non-thread safe code in the thread safe one. Without following the rules rigidly there's no way I could get it right. It's really important not to try to lock something that is already locked!!!!
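The locking hazard can be shown with a small Python sketch. This is not the Felix collector's code; `Counter`, `add`, `add_twice` and `_add_unlocked` are hypothetical names chosen only to illustrate public methods that take a lock and private helpers that assume it is already held:

```python
import threading


class Counter:
    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant lock
        self._value = 0

    def _add_unlocked(self, n):
        # Private helper: assumes the caller already holds the lock.
        self._value += n
        return self._value

    def add(self, n):
        # Public method: takes the lock, then uses only private helpers.
        with self._lock:
            return self._add_unlocked(n)

    def add_twice(self, n):
        # Calls the helper twice under one lock. Calling self.add() here
        # would try to re-acquire the held non-reentrant lock and deadlock.
        with self._lock:
            self._add_unlocked(n)
            return self._add_unlocked(n)
```

Because public methods never call other public methods, the lock is acquired exactly once per public call, and the unlocked helpers can be shared with a single-threaded variant.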

adamsol commented 3 years ago

"Languages which allow any method to be overridden are broken. It cannot work. Languages without protected access are also broken and cannot work."

Well, this is not true, since Python works. You have the right to claim this is a "junk" language and that so is Pyxell, but then I don't understand why you went to the trouble of writing it all down. I'm concentrating on the practicality and simplicity of my language, not on academic theorizing.

skaller commented 3 years ago

Pyxell isn't a junk language, Python is. Difference: static typing.

adamsol commented 3 years ago

But still, in Pyxell all methods are virtual, and there are no class field visibility modifiers, so according to what you've said, it cannot work. However, my reasoning is that static typing is necessary for compilation to machine code, while class visibility modifiers are not -- it can easily remain just a convention, like in Python.

skaller commented 3 years ago

Right, it cannot work, meaning, the programmer cannot enforce the maintenance of invariants. Virtual methods are abstract data and should be treated exactly the same way data is to enforce encapsulation: access is only allowed from the base, the same as for concrete data.