should there be two definitions of private within a module?

mppf commented 1 year ago

Today, the compiler uses two different definitions of private within a module:

one for private use/import
another one for private variables

Here is an example that demonstrates the problem:

// Example 1
module Library {
  var libraryVar: int; 
}
module M {
  use Library; // makes Library and libraryVar privately visible
  private var mVar : int; // makes mVar privately visible

  module WithinM {
    use M;

    proc main() {
      writeln(mVar);               // this compiles
      writeln(Library.libraryVar); // does not compile
      writeln(libraryVar);         // does not compile
    }
  }
}

I think this is a bug. But, if it is a bug, which is the definition that we want?

The two definitions differ in how they handle submodules:

private means that it cannot be used outside of the module, but a nested module is within its parent module, so should have access to private symbols
private means that it cannot be used outside of the module, and a nested module is a different module, so should not have access to the private symbols.

~At present,~ At the time this issue was created, the dyno scope resolver was using the 2nd definition for everything. However, we have some tests for the 1st definition (with variables) and I was looking at getting them working. (since then, the dyno scope resolver has been updated to match the production compiler in this way).

I think the spec could easily be clearer on this point, but my reading of it is that the combination of https://chapel-lang.org/docs/language/spec/modules.html#visibility-of-a-modules-symbols and https://chapel-lang.org/docs/language/spec/modules.html#nested-modules means that it is describing option 1

A nested module (or sub-module) is a module that is defined within another module, known as the outer, or parent, module.

A symbol’s visibility inside its module is controlled by normal lexical scoping and is not affected by its privacy-specifier.

This question is a reframing of this older issue:

15314

Some related examples that are perhaps off-topic in details below.

Here is a related example that does not compile today: ``` chapel // Example 2 module M { private var x : int; module Sub { use M; writeln(x); // is this access to x OK? } use Sub; } ``` But, there is a test (that compiles successfully) that is very similar: https://github.com/chapel-lang/chapel/blob/15a225ef2dff91c0c83de7c0bb126e780be110a7/test/visibility/private/variables/privateToParent.chpl#L1-L11 As a result, I am thinking that Example 2 should compile and the fact that it does not is simply a bug.

What could we do about this?

Option 1 / Definition 1: a submodule can work with a private use in a parent module (so libraryVar is OK in the example) just as it can work with other private symbols. This would require implementation changes.
Option 2 / Definition 2: a submodule can't access private contents of a parent module in any circumstance. This would require implementation changes and spec updates & be potentially breaking.
Option 3: keep the current implementation; where a submodule can access a private variable or method in a parent module, but can't work with private use / private import. This would require specification updates.

Note that changing from Option 3 to Option 1 can be breaking (see Example 6 in a comment below), and so can changing from Option 1 to Option 3 (since some cases in Example 1 would stop compiling).

bradcray commented 1 year ago

In the following, I'm guessing/hoping that changing uses to imports (of the symbols in question) above doesn't change the overall behavior? (or at least, hope it won't)

Of the two definitions, definition 1 seems more attractive and useful to me. Specifically, if we took definition 2, then I don't think there'd be any way for a parent module to make a symbol available to its children without making them available to everyone? (at least without introducing a new language concept). That seems like an attractive pattern to be able to support, and I don't see an obvious downside (since, arguably, it's reasonable for parent and child modules to be cognizant of one anothers' contents; and if one were worried about some sort of surprise or hijacking in this situation, they should use import rather than use anyway).

If we considered a definition 3 that treated private differently in these two cases ("the current behavior"), it seems like it could be rationalized as follows:

I should be able to see the symbols my parent module defines, whether they are public or private
I may not want to be subject to private use/importss within my parent module if they end up bringing in symbols that I didn't expect (that said, I'm hoping the shadow scoping rules would prevent us from getting into any ambiguity / hijacking situations if the used module added new symbols over time?
Since I have to use/import my parent module to get access to its symbols (is that right?), it may seem odd that I'd automatically get access to modules that it happened to privately use/import. I.e., it might be more consistent to say that I don't automatically inherit its private uses/imports since I don't automatically get access to its symbols without use/import

The main pause I have here is one that I think we've already discussed and decided on, which is: Given that the child module has this special relationship to their parent, should it even need to use a use/import to get access to the symbols?

mppf commented 1 year ago

In the following, I'm guessing/hoping that changing uses to imports (of the symbols in question) above doesn't change the overall behavior? (or at least, hope it won't)

Right.

``` chapel // Example 4 module Library { var libraryVar: int; } module M { import Library; // makes Library and libraryVar privately visible private var mVar : int; // makes mVar privately visible module WithinM { use M; proc main() { writeln(mVar); // this compiles writeln(Library.libraryVar); // does not compile } } } ``` ``` chapel // Example 5 module Library { var libraryVar: int; } module M { import Library.libraryVar; // makes Library and libraryVar privately visible private var mVar : int; // makes mVar privately visible module WithinM { use M; proc main() { writeln(mVar); // this compiles writeln(libraryVar); // does not compile } } } ```

lydia-duncan commented 1 year ago

The main pause I have here is one that I think we've already discussed and decided on, which is: Given that the child module has this special relationship to their parent, should it even need to use a use/import to get access to the symbols?

Yeah, we've already discussed and decided on the current behavior w.r.t. needing a use/import to access the parent's symbols

mppf commented 1 year ago

(Some breadcrumbs pointing to that discussion / decision are #15312 and https://github.com/Cray/chapel-private/issues/884)

bradcray commented 1 year ago

One other potential rationale for the status quo: If a child module can't see its parent module's private symbols, there's not much recourse. But if it can't see the symbols made available by its parent module's private uses and imports, it can repeat those private uses and imports itself.

The only downside to that that I can see is: What if a parent module had 10 private uses and 10 child modules and each of those child modules also needed those 10 private uses? Then I'd have to type 11 use statements naming 110 module names rather than just 1 module use statement naming 10. That said, I don't have a realistic case for this in mind.

Flipping this rationale around: If a child module automatically sees the symbols that were made available to its parent module via private uses and imports, but doesn't want to, it's sortof stuck seeing them? Whereas if it has to have its own uses and imports to get those symbols, then it is its choice whether it does or not?

mppf commented 1 year ago

One other potential rationale for the status quo: If a child module can't see its parent module's private symbols, there's not much recourse. But if it can't see the symbols made available by its parent module's private uses and imports, it can repeat those private uses and imports itself.

FWIW, I view this primarily as an argument against Option 2. Other than that, it's just pointing out that Option 3 is acceptable. I don't see it as an argument against Option 1.

The only downside to that that I can see is: What if a parent module had 10 private uses and 10 child modules and each of those child modules also needed those 10 private uses? Then I'd have to type 11 use statements naming 110 module names rather than just 1 module use statement naming 10. That said, I don't have a realistic case for this in mind.

Well, you could always factor those private uses into a helper module that public uses everything and that is in turn used by the child modules. (That would create different behavior if there is a symbol defined in multiple used modules though, FWIW, per #21157).

Flipping this rationale around: If a child module automatically sees the symbols that were made available to its parent module via private uses and imports, but doesn't want to, it's sortof stuck seeing them? Whereas if it has to have its own uses and imports to get those symbols, then it is its choice whether it does or not?

The child module doesn't have to use the parent module at all and it won't have access to these if it does not. Likewise, if it wants to bring in only certain symbol names or leave out certain symbol names, it can do that by adjusting the use / import of the parent module (e.g. with use only or use except).

My own opinion on this is that the language is more consistent and explainable with Option 1. In particular, in the dyno scope resolver, I actually had to add a special case in order to match the production compiler with Option 3. (Likewise, we would have to basically add a special case to the spec, in order to document it). However, I don't think that simple consistency is a particularly compelling argument. Arguments to do with what Chapel programmers can do / will do with the feature hold more weight. However, so far I haven't found any arguments in that vein that I find particularly convincing either.

bradcray commented 1 year ago

The child module doesn't have to use the parent module at all and it won't have access to these if it does not. Likewise, if it wants to bring in only certain symbol names or leave out certain symbol names, it can do that by adjusting the use / import of the parent module (e.g. with use only or use except).

That's a great point that didn't occur to me last night when writing this.

I can live with option 1 or option 3. If I were to argue in favor of option 3 (*), it would be that when I'm writing code in a child module that uses its parent, the symbols in that parent module are pretty evident, whether private or public. Whereas once I start also inheriting the symbols made available via private use, suddenly what I'm getting is less obvious/evident (unless the private use includes an only clause...).

Of course, this could also be viewed as a general argument against using use rather than import (or use with only) within the parent and/or child module, so it's really not the strongest argument. So back to the start, I can live with option 1 or option 3.

If I'm thinking about it right, I think that switching from either option to the other would represent a breaking change post-2.0, so neither is particularly more conservative or safer than the other...?

= I really should've called it option 0...

mppf commented 1 year ago

If I'm thinking about it right, I think that switching from either option to the other would represent a breaking change post-2.0, so neither is particularly more conservative or safer than the other...?

I'm not aware of a program that would break if we went with updating the spec to document Option 3 in the near term & then changed to Option 1 in the future, say, post-2.0. Are you?

(For me though, the consistency argument wins over the "non-breaking change later" argument, especially since the implementation is basically already done in the dyno scope resolver).

bradcray commented 1 year ago

I was imagining that if we switched from option 3 to option 1 and:

the parent module contained a private use of a module M
and as a child module I used my parent module P that I could get an ambiguity error between a symbol defined in my scope and one defined in the M's scope. Though for most symbols, I suppose mine would win because they'd shadow M's. Do we no longer have any cases where a new overload routine introduced through such a 'private use' would change the behavior if it shared a name with overloads in my scope? (I can never remember where all that landed). If not, then I agree with you.

In any case, I can still live with option 1 or option 3.

mppf commented 1 year ago

Ah, that is an excellent point.

// Example 6
module Type {
  record R { }
}
module Library {
  use Type;
  proc R.method() { }
}
module M {
  public use Type;
  use Library; // R.method

  module WithinM {
    use M;

    proc R.method() { }

    proc main() {
      var x: R;
      x.method(); // works today but becomes ambiguity with Option 1
    }
  }
}

Related to this example, we aren't using a "more visible" rule in disambiguation for methods anymore, but would a non-method function in M be more visible inside of WithinM than one in Library? That would mean that the "flat bill of sale" (from PR #19306) idea does not apply to nested modules. But, I think that could be OK, because we could view the use M here as enabling the scope lookup to traverse to the parent module.

This would get really weird if WithinM had a public use M though (but I can't think of any reason we couldn't simply make that an error -- re-exporting a parent module has a kind of circularity to it and I don't know of a use case).

Here is an example of how the "flat bill of sale" will not apply, but this does work today with the dyno scope resolver if I have it using Option 1 (and it works the same as with Option 3). (Which is evidence that allow it will not present an undue burden).

// Example 7
module Library {
  var x: int;
}
module M {
  use Library;
  private var x: real;

  module WithinM {
    use M;

    proc main() {
      writeln(x);
    }
  }
}

lydia-duncan commented 1 year ago

It doesn't look like this was tracked in the spreadsheet of language design tasks (possibly due to being opened later) and it doesn't seem like we have a full decision here (though I think we decided we're not doing option 2). If I'm reading this discussion right, it seems like:

The spec describes option 1
The production compiler (and as a result of #22324, the dyno scope resolver) implements option 3.

So we can either:

update the spec (which won't break user code) or
update the compiler (which will, but could maybe be considered a bug fix?)

Does that mean it's okay to remove the 2.0 tag?

mppf commented 1 year ago

This is corner-case enough that I think we could claim it is a docs bug fix (if we update the spec for Option 3) or a compiler bug fix (if we update the compiler for Option 1).

bradcray commented 1 year ago

I don't object to removing the label, but just to understand where we are in hopes of not having this issue hanging over us forever:

Michael, the OP says that dyno implements the second definition, which seems to correspond to option 2. Is that still the case today, or did anything change about dyno since we filed this? And: now that dyno scope resolution is used, is it the case that we have this behavior on main today? Or is the work you mention doing to implement option 3 in dyno later in the comment stream what happens on main today? (I'm guessing this is the case, re-reading Lydia's comment).
Do either of you have a preference as to what should happen here? How strong is that preference? (I'm trying to figure that out for myself...)

lydia-duncan commented 1 year ago

I have a slight preference for matching the spec, but I'm okay with updating it instead.

bradcray commented 1 year ago

That's option 1?

mppf commented 1 year ago

Michael, the OP says that dyno implements the second definition, which seems to correspond to option 2. Is that still the case today, or did anything change about dyno since we filed this? And: now that dyno scope resolution is used, is it the case that we have this behavior on main today? Or is the work you mention doing to implement option 3 in dyno later in the comment stream what happens on main today? (I'm guessing this is the case, re-reading Lydia's comment).

Right, we changed the dyno implementation to match the production compiler in this regard. I clarified in the OP for the future.

Do either of you have a preference as to what should happen here? How strong is that preference? (I'm trying to figure that out for myself...)

From my other comments, the past-me had a preference for Option 1. Today, I don't have any strong preference here.

I have a slight preference for matching the spec, but I'm okay with updating it instead.

That's option 1?

Yes I'm pretty sure that "matching the spec" means Option 1.

mppf commented 1 year ago

I added the "bug" label on the theory that the current situation represents either a docs bug (if we go with Option 3) or a compiler bug (if we go with Option 1).

mppf commented 2 months ago

I am looking at some dyno scope resolver issues (mainly #25551, but I am also keeping in mind if we can address current performance gap we have with --dyno-scope-bundled). This issue relates to what the dyno scope resolver needs to handle.

At present, my preference is to leave the implementation as Option 3 and update the spec. I see the primary advantage of doing this as minimizing changes now that we are stable. In particular, it reduces the risk of some surprising consequence to switching to Option 1, since we have been living with Option 3.

It seems that we have been able to argue both for Option 1 and Option 3 in the discussion above, and IMO the main conclusion was that we view Option 1 and Option 3 as both acceptable. I think that the stability property of preserving the implemented behavior is more important than any of the arguments we have for Option 1 or for Option 3.

chapel-lang / chapel

should there be two definitions of private within a module? #21723

15314