should grace have self (and outer) requests?

should grace have self (and outer) requests? #140

Closed kjx closed 6 years ago

kjx commented 6 years ago

How can confidential methods be called? Where should the run method crash? (assuming it should crash)

class a {
    method c is confidential {
        print "c"
    }
    method run {
        c            // implicit receiver request 
        self.c     // self request
        def x = self 
        x.c        // explicit receiver request
    }
}
a.run

Both Kernan & AMG crash only on the explicit receiver request: 'self' (and 'outer') requests are special (so we currently have four kinds of requests: implicit, explicit, self, and outer). Newspeak needs self and outer requests because it is permissible to have the same name shadowed in lexical scope and inherited - so implicit requests have to be disambiguated. If Grace is serious about banning ambiguous programs, it seems to me we don't need special 'self' and 'outer' requests.

Removing these requests would mean that you could still e.g. write 'self.foo' or 'outer.foo' --- it's just that this would only work if 'foo' was public. Confidential methods could only be called by implicit requests.

kjx commented 6 years ago

see also #102

apblack commented 6 years ago

I think that we have three kinds of request:

1. Receiverless requests (aka implicit requests); these always target one of the current selfs: self, outer, outer.outer, etc. Receiverless requests can be (and are!) rewritten (think of Eelco's "lowering") to kind 2.
2. Yourself requests, which have an explicit self, outer, outer.outer, etc.
3. Targeted requests, which have an explicit receiver of some other syntactic kind.

And yes, it is the syntax of the request that determines whether or not it is a self request, not the current value of a variable.

Ideally, the above program should break at compile time, but if not, it should break when c is requested on x.
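As a rough illustration of "the syntax of the request determines the kind", here is a small Python sketch that classifies a request purely from the shape of its receiver expression. The `Identifier`, `Request`, and `classify` names are hypothetical and are not taken from Kernan, minigrace, or any other Grace implementation; parenthesised receivers are not modelled.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Identifier:
    """A bare name used as a receiver, e.g. `self`, `outer`, or `x` (hypothetical AST node)."""
    name: str


@dataclass(frozen=True)
class Request:
    """A request `receiver.name`; receiver is None when no receiver is written."""
    receiver: Optional[Union[Identifier, "Request"]]
    name: str


def is_outer_chain(expr) -> bool:
    """True for the syntactic forms outer, outer.outer, outer.outer.outer, ..."""
    if isinstance(expr, Identifier):
        return expr.name == "outer"
    return (
        isinstance(expr, Request)
        and expr.name == "outer"
        and expr.receiver is not None
        and is_outer_chain(expr.receiver)
    )


def classify(request: Request) -> str:
    """Classify by syntax only; the run-time value of the receiver is never consulted."""
    receiver = request.receiver
    if receiver is None:
        return "receiverless"
    if receiver == Identifier("self") or is_outer_chain(receiver):
        return "yourself"
    return "targeted"


# The three requests from the original example, plus an outer.outer request.
implicit_c = Request(None, "c")                  # c
self_c = Request(Identifier("self"), "c")        # self.c
x_c = Request(Identifier("x"), "c")              # x.c, where x happens to be self
outer_outer_c = Request(Request(Identifier("outer"), "outer"), "c")  # outer.outer.c

for r in (implicit_c, self_c, x_c, outer_outer_c):
    print(classify(r))
```

Because `x.c` is classified as targeted even though `x` is bound to `self`, it is the request that a syntax-based rule would reject.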

My current syntax has 4 syntactic categories for request:

Request
    : RequestWithArguments 
    | SelfRequest
    | OuterRequest
    | TargetedRequest
    ;

but there is a fifth one — a plain identifier — that does not show up here, because it is already allowed in another part of the grammar for expressions. Having to separate (in the syntax) requests like foo from requests like foo 3 is one of the less-pleasant consequences of our design.

I'm not clear what this issue is. I think that we all agree that we need self requests and outer requests.

kjx commented 6 years ago

I think that we have three kinds of request:

Yes, that's a better way to describe what the underlying issue is. How many kinds of requests, what kinds of requests, what we call them.

I think that we all agree that we need self requests and outer requests.

I used to think so. Now I'm really not so sure. We got rid of super requests - can we get rid of more? I am attracted by the simplicity of a rule saying confidential declarations can only be accessed by receiverless requests. At that point self and outer would be special definitions, but not special requests: writing self.foo would have the same semantics as def x = self; x.foo

What this simplification loses is the possibility of teaching Grace where receivers are always syntactically explicit, so e.g. you'd always write self.foo instead of foo whenever self.foo is legal.

apblack commented 6 years ago

I am attracted by the simplicity of a rule saying confidential declarations can only be accessed by receiverless requests.

I don't think that this is simpler. Instead, it confuses two separable things: the shorthand syntax with the receiverless request, and the rule on confidentiality.

As things stand, we can (and I do, when I teach) explain the receiverless request as a shorthand for a request that starts with self. or outer. We can expand that shorthand with no change in semantics: I think that's important.

Then, there is the rule for confidential attributes: they can be accessed only from the inside. These two things can be explained and understood separately.

In addition, your rule doesn't work, because sometimes one needs to say outer explicitly. There were cases in the collections library where, because the outer object and the inner object had to implement the same interface, the outer was unavoidable.

kjx commented 6 years ago

I don't think that this is simpler. Instead, it confuses two separable things

separable, sure. The question is whether or not to separate them. We unified super-requests and receiverless requests via aliasing. I now think we have the option to get rid of the two remaining special cases --- explicit outer and explicit self --- and I still think that getting rid of two of four kinds of requests makes the language simpler. Similarly, I think that making the syntactic request form (receiverless vs "receiverfull") directly determine the encapsulation semantics is simpler than having a more complex rule.

We can formulate the current rule as something like:

"confidential methods can be requested either by internal requests, or by external requests on the keywords self or outer"**

(trying out "internal" & "external" rather than "receiverless"/"implicit" and "explicit").

The question is whether we want the second clause in the disjunction. If we don't need it, should we keep it? I know we're used to Java and Smalltalk, which obviously keep the disjunction. Eiffel, however, does not - from the Eiffel ECMA-367 standard:

An Object_call appearing in a class C, with fname as the feature of the call, is export-valid for C if and only if it satisfies the following conditions. 1 fname is the final name of a feature of the target type of the call. 2 If the call is qualified, that feature is available to C... As a consequence s (...) might be permitted and x.s (...) invalid, even if x is Current.

Then, if you're willing to adopt the Eiffel rule, and let go the ability to write redundant outer or self, there's a practical question of whether you ever need them.

There were cases in the collections library where, because the outer object and the inner object had to implement the same interface, the outer was unavoidable.

sure. but if the outer object is implementing an interface, those methods will be public, so you can write the explicit send outer.foo and get what you want. I still want to keep self and outer, but just remove the special cases. I don't think this reduces expressivity, because I think you can (almost*) always rename confidential methods in surrounding lexical scopes: that seems a lexical analogue to aliasing.

We can expand that shorthand with no change in semantics: I think that's important.

yep. that's the choice. if keeping the expansion really is that important, we can keep the current design, but then the language has to have four different kinds of requests - we should be explicit about them and get the wording right about all the corner cases etc.

The Eiffel rule will highlight the asymmetry in visibility between methods (default public) and fields (default confidential). With the Eiffel rule, programs with no annotations can get away with writing self.m for any unannotated method m, but any unannotated field f will have to be accessed only as f --- just like local variables (even in Smalltalk you can't write thisContext.f). So perhaps I'd argue yes there is an asymmetry still there but it just moves around slightly...

*"almost" because I think there might be situations with inheritance where you have internal protocols based on confidential methods where you cannot rename those methods because they are called by the lexically-outer-object's-superclass. But - if you're making outer calls I think you can always add an aux method in the outer scope that requests the confidential method implicitly on its self. There may also be problems with subclasses of intermediate objects... but I think the current manifest requirements rule out those cases. Or not.

*"almost" also if you have confidential methods in the dialect, which you probably don't want to modify, and probably can't because modules are fixed objects.

**Plus, the Eiffel design avoids nasty corner cases like (((self))).c which in NS, amg & Kernan is resolved like self.c --- but based on the spec, I'm not quite sure why. Whatever the full rule is, it must be quite subtle.

KimBruce commented 6 years ago

My sympathies are with Andrew’s position on this. If we had to give up one of the three kinds of requests (and I hope we don’t), I’d rather give up the receiverless one and make the user always explicitly write self or outer. Self is tricky (as shows up in your original example) because giving it another name changes its type (implicit or explicit). It’s rare that you would see something like def c = self, but there are lots of examples where you might want to send self as an argument: jim.makeFriends(self), and the formal parameter for makeFriends only has the public interface of self.

In general I wanted to ban as much of the possible ambiguity as possible and so not allow shadowing of any identifiers from outer scopes. However, I believe we had problems with that and the solution was to require the use of self or outer to disambiguate the uses of the identifiers (though I don’t know if that ever made it into the reference).

Kim

kjx commented 6 years ago

If we had to give up one of the three kinds of requests

I think we currently have four kinds of requests: internal, external, self, outer (down from 5 or 6 - no more super requests or dialect requests).

(and I hope we don’t)

We don't have to - but Eiffel has just two kinds of requests. I increasingly think we can too.

make the user always explicitly write self or outer

outer.outer.outer.for( self.contents ) do { each -> ... }

Making outer explicit I think undoes the design of dialects - although I suppose you could say that you don't need explicit outers for stuff in dialects, but you do for other enclosing scopes. Making self explicit isn't so bad: but it makes code, particularly code manipulating variables, uglier, which is why Eiffel & Self supported receiverless syntax.

Self is tricky (as shows up in your original example) because giving it another name changes its type (implicit or explicit).

Actually with the Eiffel rule, self always has the same type: the externally visible type. If you assign self to a def, you get exactly the same access via that def (or via any other reference) as you do via "self" - you get the public interface because you're requesting methods on an explicit receiver either way. It's just that with the Eiffel rule, the only way you get access to the confidential parts of an object is to not write self - and that's counterintuitive to Java or Smalltalk programmers, and also means you can't use explicit selfs or outers where confidential declarations are being requested.

I wanted to ban as much of the possible ambiguity as possible and so not allow shadowing of any identifiers from outer scopes.

What I realised is that the more ambiguity we can remove, the less we need to be able to disambiguate. If everything was unambiguous, we would never need to disambiguate. As it is, for most cases, I'm fairly sure that a combination of aliases and local abstract method declarations can sort stuff from superclasses, and adding extra forwarding declarations (I don't particularly want to add aliases into objects, but we could) - see #144, which discusses this and also has a bit of a worked example.

The other issue - once again - is the extent to which we want to restrict ambiguity in the core language semantics, vs handing that responsibility off to a dialect. If it was in a dialect, then the underlying semantics could e.g. have a disambiguation rule (probably have to be lexical first, out then up) but dialects could more-or-less ensure that rule would never be invoked...

kjx commented 6 years ago

So, going back to the example at the top - https://github.com/gracelang/language/issues/140#issue-283448685 - I just realised there's a third option which gets rid of outer altogether, gets rid of self requests being special, and resolves a longstanding problem about naming enclosing objects: use semantics rather than syntax to permit calls to confidential attributes.

The idea is that any call on the current 'self' object, or on an actually enclosing object may access confidential attributes. In terms of the example above, all the calls would be permitted. The semantics are clear - you just look for the receiver up the sender's lexical chain. An implementation would optimise this with exactly the semantics minigrace (say) has now - but where the existing check would fail, the implementation would additionally do the lexical lookup dynamically if necessary - and since most of these names would be manifest, a dynamic lookup would rarely be necessary.

What this buys us is that programmers can just make names for objects, either externally or internally, doing nothing but defs, no special syntax for object names or whatever --- and then calls on those def-names work as if they were outer sends.

def external_name = object {
   def internal_name = self
   method c is confidential { "c" }
   class test {
       method run {
           c // implicit call - works
           internal_name.c // works!
           external_name.c // works!!
       }
   }
}

No need for outer, or outer.outer. No need for self to be anything other than a pseudo variable. Simpler semantics and simpler programs.

apblack commented 6 years ago

Once again: simpler for the dynamic semantics. More complicated for the programmer and for the implementer and for program analysis. Which do we choose?

To make this work we would also have to get rid of the rules that restrict what can be inherited, because most objects would have to be put in a def so that they could be given a name, and we can't presently inherit from a def. Of course, that's just one more place where trying to restrict the language to make programs easier to understand and implement has made the dynamic semantics more complicated.

Thinking about it, isn't lexical scope (rather than dynamic scope) in Lisp yet another example?

kjx commented 6 years ago

Once again: simpler for the dynamic semantics.

Actually, no --- I think the dynamic/semantic option is the most complex for the dynamic semantics too. In a few days I should be able to show this properly, but for now:

The Eiffel option is purely syntactic - it's the simplest thing that can possibly work: receiverless requests always do lexical dispatch (and can access confidential attributes); receiverfull requests always do external dispatch (and cannot access confidential attributes). "Confidential attributes can only be accessed from receiverless requests" (9 words)
the self/outer option is also syntactic - the same as Eiffel, but with two exceptions -- basically a receiverfull request on self does an "inheritance-only" lookup that can access confidential attributes; a receiverfull request on outer does an "inheritance-only" lookup to that object, which again can access confidential attributes. "Confidential attributes can be accessed from receiverless, self, or outer requests" (11 words)
the semantic option is, well, semantic. This option ignores the request syntax. For every request of a confidential attribute, you have to dynamically calculate the relationship between receiver and requester and work out if the requester is "lexically inside" the receiver. This seems the most complex to me. "Confidential attributes can only be requested from lexically inside the object to which they belong" (15 words :-)
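To make the comparison concrete, the three rules can be modelled as three predicates over the same request. This Python sketch is only an illustration: `ConfidentialRequest` and the three function names are hypothetical, and the "lexically inside" fact is supplied as a plain boolean rather than computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfidentialRequest:
    """A request for a confidential attribute (hypothetical model, not Grace code)."""
    receiver_syntax: Optional[str]      # None for a receiverless request, else the receiver text
    receiver_encloses_requester: bool   # is the receiver the requester's self or an enclosing object?


def eiffel_rule(req: ConfidentialRequest) -> bool:
    """Confidential attributes can only be accessed from receiverless requests."""
    return req.receiver_syntax is None


def self_outer_rule(req: ConfidentialRequest) -> bool:
    """As Eiffel, plus receivers written as self, outer, outer.outer, ..."""
    if req.receiver_syntax is None:
        return True
    parts = req.receiver_syntax.split(".")
    return parts == ["self"] or all(part == "outer" for part in parts)


def semantic_rule(req: ConfidentialRequest) -> bool:
    """Ignore the syntax; only the receiver/requester relationship matters."""
    return req.receiver_encloses_requester


# The three requests in `run` from the top of the thread, plus one from outside.
requests = [
    ConfidentialRequest(None, True),        # c
    ConfidentialRequest("self", True),      # self.c
    ConfidentialRequest("x", True),         # x.c, where x is bound to self
    ConfidentialRequest("other", False),    # other.c, requested from unrelated code
]

for rule in (eiffel_rule, self_outer_rule, semantic_rule):
    print(rule.__name__, [rule(r) for r in requests])
```

Under this model the rules differ only on `self.c` and `x.c`: the Eiffel rule rejects both, the self/outer rule rejects only `x.c`, and the semantic rule accepts both.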

More complicated for the programmer and for the implementer and for program analysis.

More complex for implementer and for analysis sure - any currently failing lookup now has another search to do. I'm not convinced it is more complex for the programmer and am beginning to convince myself that it is simpler than the self/outer semantics, but not as simple as the Eiffel semantics - this of course is a matter of opinion!

Which do we choose?

Ideally the programmer; in practice we have to make tradeoffs especially with respect to likely implementation technologies. Of course, as you said earlier, "who is the programmer"? - and when we say simpler, do we mean simpler for the programmer to understand, or allowing programmers to write simpler or shorter programs. I'm leaning towards the idea that being able to name enclosing objects directly, access them via those names (with self as a special case, and outer going away) wins on balance overall.

To make this work we would also have to get rid of the rules that restrict what can be inherited,

We could get rid of those rules (or some of 'em), perhaps we should, but we don't need to for this to work.

because most objects would have to be put in a def so that they could be given a name, and we can't presently inherit from a def.

We don't need to - because a def that names an object can be inside the object or class definition it's naming.

class listSlice(backingList,start,limit) {
  def listSelf = self // this is already legal, simple, straightforward
  method confidentialAt(x) is confidential { backingList.at(x) }   
  class  iterator {
      var cursor = start
      method next { listSelf.confidentialAt( cursor ) }  
          //only the semantic option permits this call to access a confidential attribute
  }
 }

You can write code like this example today - I've done it a few times - but you cannot access confidential features via a receiverfull request (to listSelf) in the Eiffel or "self/outer is special" options. This is what's currently attracting me to this option - we get something we've often said we wanted - object names - with zero additional syntax, by writing "natural" code that programmers may expect to work anyway!

What happens if a subclass overrides listSelf? Well, the code will probably break - but there's no access granted that wouldn't be otherwise, because access depends on the semantic relationship between actual objects, not the syntax used to access them. Yes, any checker or analysis would be conservative - sound but not complete --- like Grace's main typechecker, so I don't see it being that much of a problem.

Of course, that's just one more place where trying to restrict the language to make programs easier to understand and implement has made the dynamic semantics more complicated.

I'll agree inheritance restrictions make inheritance easier to implement. Understanding - depends who you are. Depends on the cost of understanding the full generality of something - or perhaps just the cost of being able to work things out from a few simple rules - vs having to remember what the restrictions are. For example, I understand how requests work, and how objects are composed via use and inherits clause requests --- but I still don't understand the precise restrictions that make things manifest.

Thinking about it, isn't lexical scope (rather than dynamic scope) in Lisp yet another example?

Dunno. Our creation semantics are pretty easy to describe operationally given dynamic scope or one thread local variable. I have a sneaking suspicion that manifest probably falls into the same category.

kjx commented 6 years ago

Earlier https://github.com/gracelang/language/issues/140#issuecomment-357490188 I said:

In a few days I should be able to do show this properly, but for now:

So, here's how the current version of inheritator2 would encode these options. This code extends handling external (receiverfull) requests, checking if the access is allowed --- internal (receiverless) requests are not checked because they are always permitted.

Eiffel (purely syntax) version:

if (!methodBody.isPublic) then {error "External request for confidential attribute {name}"}

Self/outer syntactic version - adds an auxiliary clause for "special" (self and outer) requests which grant access to confidential attributes, and then defines those requests syntactically as the receiver being the strings self or outer. Those variables are already defined elsewhere so the receiver will be right: the question is purely about encapsulation.
```
if (! (methodBody.isPublic || isSpecialRequest)  ) 
then {error "External request for confidential attribute {name}"}
def isSpecialRequest = 
ImplicitRequestNodeBrand.match(receiver).andAlso {
(receiver.name == "self") || (receiver.name == "outer") }
```

Semantic version - keeps the auxiliary clause but defines special requests as requests to objects that lexically enclose self.

if (! (methodBody.isPublic || isSpecialRequest)  ) 
then {error "External request for confidential attribute {name}"}
def isSpecialRequest = rcvr.lexicallyEncloses(mySelf)

of course, lexicallyEncloses had to be defined in the runtime object model:

class topContext {
method lexicallyEncloses(other) {self == other}
...
} 
class lexicalContext(ctxt) {
method lexicallyEncloses(other) {(self == other) || ctxt.lexicallyEncloses(other)}
...
}

Now, this is all quite straightforward: it's at most 5 lines on top of an interpreter core currently around 1200 lines - although the lines are more subtle than most*. But I hope, perhaps, this example does two things:

lets us talk more precisely about the competing definitions of the encapsulation rules
shows that the Eiffel option is conceptually "simplest"; the current option next; the semantic option the most complex - we had to add a new relation into the semantic object model.

The question then is what makes the language simpler overall. Both semantic and Eiffel options mean we could completely get rid of outer, and self loses its special case encapsulation semantics. The Eiffel option does this by restricting expressivity (good for novices?) while the semantic option makes the language more expressive.

My model is intentionally inefficient: but I think something like Moth would have no problem optimising outer access in the semantic option; something like minigrace could just check at compile time if receivers were manifestly naming outer objects.

*more subtle e.g. than the code defining primitives or interfacing with the AST; rather less subtle than the inheritance / lookup code - currently 120 lines (but should be about 60) that has taken me a week to write so far...
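For readers who want to experiment with the semantic option outside an interpreter, here is a minimal Python sketch of the lexical check as described earlier in the thread ("you just look for the receiver up the sender's lexical chain"). The `Context` class and `is_self_or_encloses` function are hypothetical names, not part of inheritator2, and the object graph loosely mirrors the listSlice/iterator example.

```python
class Context:
    """A run-time object that remembers the object it was lexically created inside."""

    def __init__(self, name, enclosing=None):
        self.name = name
        self.enclosing = enclosing  # None for the top-level (module) context


def is_self_or_encloses(receiver, requester):
    """Walk outward from the requester; succeed if the receiver is found on the way."""
    ctx = requester
    while ctx is not None:
        if ctx is receiver:
            return True
        ctx = ctx.enclosing
    return False


# module > listSlice > iterator, plus an unrelated object in the same module
module = Context("module")
list_slice = Context("listSlice", enclosing=module)
iterator = Context("iterator", enclosing=list_slice)
unrelated = Context("unrelated", enclosing=module)

print(is_self_or_encloses(list_slice, iterator))   # iterator requesting on listSelf
print(is_self_or_encloses(iterator, list_slice))   # listSlice requesting on its iterator
print(is_self_or_encloses(unrelated, iterator))    # iterator requesting on an unrelated object
```

The walk is linear in the nesting depth, which is usually small; an implementation could skip it entirely whenever the receiver is manifestly an enclosing object.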

kjx commented 6 years ago

another example of "kjx's useless, distracting, and pointless F**king around" as we put it in the discussion this morning.

We'll keep self as it is, and outer, and outer.outer, and outer.outer.outer