chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
1.78k stars 419 forks source link

Should an explicitly public use override the declared privacy of a module? #13528

Open lydia-duncan opened 5 years ago

lydia-duncan commented 5 years ago

This issue came about as a question of how to enable re-export patterns and #13979 offers some context there.


Michael mentioned in issue 13524 that he would expect publicly using a private module to enable its symbols to be publicly available, but that isn't the case today. Should it be?

lydia-duncan commented 5 years ago

I'm of two minds on this myself.

On the one hand, it seems like the private applied to a symbol itself should be respected with precedence over the public-ness of symbols or statements that utilize it, due to it being more specifically applied. A public use shouldn't cause a private use to be flipped into a public one. Why should a public use cause a private module to be exposed publicly?

On the other hand, private modules can usually only be used in the context in which they are defined (i.e. their parent scope), which means that the scope of the public use is the same as the scope where the private module exists - if any place should have control over whether a symbol is publicly available, it should be the scope in which it was defined. Publicly using a private module would otherwise be meaningless - it wouldn't impact anything.

lydia-duncan commented 5 years ago

Additionally, if #13524 becomes implemented, the situation becomes more complicated.

mppf commented 5 years ago

I got this idea from Rust, where they talk about re-exporting things from submodules. But I might have totally confused some elements...

mppf commented 5 years ago

Another detail is that I think this only could make sense for making (part of) a submodule public. (Vs making private things in a use'd module public, say).

lydia-duncan commented 5 years ago

Another detail is that I think this only could make sense for making (part of) a submodule public. (Vs making private things in a use'd module public, say).

That seems reasonable. There will probably be users that view "private module" as meaning they don't have label everything in the module as private, though - in a world where private modules can be publicly used, we may want to be more explicit about that impact so users don't fall into that trap.

BryantLam commented 5 years ago

(I deleted my previous post because I got slightly confused with my rationale.)

Both Chapel and Rust's use statement do two things, though slightly different on the second thing. The Rust use is overloaded to do two things: (1) allow the current scope to use those items (same as Chapel); i.e., import the symbols into scope and (2) re-export that item to clients of the current module; Rust's use will straight-up re-export whatever it is you give it, even if it's another module. (code):

// Rust
pub mod m1 {
    pub use crate::m2; // [[2]] Symlink m2 here. The actual m2.
    pub use crate::m3::*; // Symlink all items inside m3.
    // Anything in m1 is also able to directly
    // access `m2::fn2` and `fn3` at this point.
}

pub mod m2 {
    pub fn fn2() { println!("fn2 from m2"); } // [[1]]
}

pub mod m3 {
    pub fn fn3() { println!("fn2 from m3"); }
}

fn main() {
    m1::m2::fn2();
    m1::fn3();
}

In Rust's module system, use is also an item like functions, modules, variables, etc. and has its own visibility, just like these items do.

  1. If you remove pub from line [[1]], then line [[2]] won't be able to "pub use" that function.
  2. If you remove pub from line [[2]], then items inside m1 will still be able to see the imported symbols in m1, but clients of m1 (e.g., main) won't be able to because it was not re-exported. That item (the use itself) is now private.

One of the older Rust proposals actually suggested removing the re-exporting feature and make use only import symbols into the current scope (and not expose them to clients). This is like Chapel's private use. The Rust community didn't like that proposal for some reason so it didn't get adopted. The proposal is under "Use statements" and specifically notes that these import-only use statements wouldn't have a privacy specifier because they're not treated as items that are visible/accessible to clients, only to the items in the current scope.

However, in Chapel, the use statement is also overloaded if you think about it this way: (1) private use will import/make visible those symbols to the current scope and (2) public use will additionally pseudo-re-export the items in the module to clients via the "global namespace", but not symlink those items from the use location. (This second case might be a mischaracterization, so feel free to better describe it from this context.)

Here's more complicated example with Rust. (It does not compile on purpose.)

As an aside, that Rust proposal has an export statement that would behave as import+export equivalent to Rust's current behavior for use statement. Brad suggested forwarding could provide the re-export behavior, though now I'm thinking whether Chapel's public use behavior is actually that useful. It seems like it could be strengthened to match Rust's re-export/symlink behavior without significant downsides to how it currently behaves.

From the question in the OP, if you follow my rationale and agree with it, I think the answer is No.

mppf commented 5 years ago

@BryantLam - in Rust, if there is a submodule, is it by default "exported" to users of the containing module? How does a user "export" symbols from the submodule?

BryantLam commented 5 years ago

If I interpret your questions correctly, pub submodules and pub submodule items are available/visible to the parent module and "exported" via the same symbol to the clients of the parent. (code)

A user can always reference any pub-visible item through a pub "chain". Otherwise, a library developer can also just pub use the submodule to "symlink" it. (Best analogy I can think of.) (code)

BryantLam commented 5 years ago

From https://github.com/chapel-lang/chapel/issues/13524#issuecomment-514383895:

Would we allow submodules defined in this way to override their privacy by defining themselves as public module L { }?

At least until we have a better strategy, I see no reason not to allow users to do this (and to use the strategy to make the module private or public). Syntactically and conceptually it is reasonable (by analogy to submodules that aren't in separate files). It just looks a little strange (since the public/private refers to M but appears in L.chpl).

The code from https://github.com/chapel-lang/chapel/issues/13524#issuecomment-514435196 shows why this issue doesn't appear in Rust. The file-based modules in Rust are all file-scoped modules (no outer-wrapped module braces), and the declaration of the module (mod Child) is in the parent module's definition. The parent module is able to control the visibility of the child modules through the declarations without having to modify the child definitions.

IMO, I don't really like that model because it leads to unnecessary duplication since the directory structure should already tell you which child modules exist, but it is notable that because the privacy/visibility specifier is placed in the parent does make it easier to see/modify the visibility later. (I think this was another Rust proposal at one point: get rid of the need to declare mod in the parent. This, too, was not adopted.)

bradcray commented 5 years ago

@lydia-duncan: Could you add a code example to the OP to make the question more concrete? I can imagine various placements of the private module declaration and the public use statement w.r.t. each other and their modules and files but am not sure what configuration(s) you're wanting to consider here.

mppf commented 5 years ago

@bradcray - here is the example I was thinking about, anyway. I'll leave it to @lydia-duncan to update the issue description if she agrees.

M.chpl

module M {
  private module L {
    proc lFunction() { }
  }
  public use L;
}

Program.chpl

use M;
lFunction();

I think the case with just module L (vs private module L) is probably more likely to come up in practice but I'm not so sure what we are expecting we will do regarding the default visibility within a module (i.e. is the default private? or public?) and that complicates the question.

lydia-duncan commented 5 years ago

Let's look at the private module declarations in the standard and internal modules today.

In FileSystem, we have a private module GlobWrappers. The symbols defined within it are not documented - they are intended to be used by the glob function in FileSystem. The module itself is only used at the function level, though there are enough of them that we could transition to a module-scoped private use of the module. It doesn't seem like the intention is to use the symbols in GlobWrappers from anywhere other than within the FileSystem module.

In BitOps, we have a private module BitOps_internal. BitOps_internal is never used, but its symbols are explicitly named (see the definition of BitOps.rotl, for instance). The module's comment says that its purpose is to hide the extern procedures.

In ChapelSyncvar, we have two private modules, SyncVarRuntimeSupport and AlignedTSupport. These modules are currently just used (no public or private prefix). But because the modules are private, no symbol defined outside of ChapelSyncvar can utilize their symbols - we just don't go into the scope if we're outside of it. They provide internal support functions that would otherwise pollute the user's namespace. I explicitly chose not to add private to the use because the module itself was private. (I don't think anyone is arguing that a use without an explicit public or private prefix should alter the privacy of the module being used, to be clear)

I wouldn't make the argument that any of these private modules should have their symbols available outside of their parent module's scope. Additionally, their contents won't be documented, so the user won't know they exist.

However, if we choose to make an undecorated module declaration private by default, we will need to be able to override the default behavior. I think that altering the unspecified default is the furthest I would want to go.

Personally, I can't think of any real world situation where I would want to declare a module as explicitly private and then make symbols within it available more broadly. It seems like if I wanted to do that, I would have put the symbol in a public place to begin with. I've argued myself further and further from wanting to support this sort of privacy overriding, the more I've thought about it.

lydia-duncan commented 5 years ago

I've argued myself further and further from wanting to support this sort of privacy overriding, the more I've thought about it.

To me, there's a bit of specificity going on. use statements can contain references to more than just one symbols (be it module or enum). That makes it more of a "general rule" to my mind. A module declaration, however, only defines one module (though modules can contain multiple symbols). To me, that makes it more of a "specific exception", that should take precedence over the general rule. Additionally, a module can be used multiple times, with multiple privacies, but it can only get declared once. To me, this strongly implies that the declared privacy should take precedence.

bradcray commented 5 years ago

In the example Michael proposed:

M.chpl

module M {
  private module L {
    proc lFunction() { }
  }
  public use L;
}

Program.chpl

use M;
lFunction();

I think my intuition is that this would be legal. Specifically, the public use L; can see L by virtue of the fact that it's in the same module scope, so can definitely refer to it in spite of it being private. And having made it a public use, it's chosen to make (some or all of) its contents available to the world (here, "all"). My intuition is that Program.chpl would not be able to refer to the L symbol itself, though (although that depends a bit on ongoing discussions about whether use only makes the symbols within a module available, or also the module's symbol itself).

Whether or not this is a good or reasonable thing to do (e.g., why not just make L public then?) is another question. I think I could imagine that if L had 100 routines in it and you only wanted to make one of them public, this would be a way to avoid labeling the other 99 as private and then you could public use L only foo; for the one you wanted to make available. But a counterargument would perhaps be to put foo outside of L in that case.

If we chose not to follow this interpretation for some reason, I think the code above should generate an error ("you've written a contradiction here, so you probably mean to fix one of these two things") rather than silently being a no-op (which is what I think is happening today?).

I should add that I haven't read the rest of the comments on this issue yet, because I started with the OP but then wasn't sure what example we were considering of the cases that I could imagine the question asking about. So the above is my opinion from a vacuum.

mppf commented 5 years ago

Personally, I can't think of any real world situation where I would want to declare a module as explicitly private and then make symbols within it available more broadly.

The real world situation that I expect this to come up is to handle creating functions with an opaque implementation. It might be considered a style thing, but some developers might prefer to mix the code performing the implementation along with the implementation.

E.g. suppose one is making a mason package for matrix math. The mason package needs to have a global top-level module and here we'll suppose that is MyMatrixLibrary. The matrix package might be implemented with one or more helper modules. But, the existence of these modules is an implementation detail - the author of the matrix package doesn't want them to be part of the API.

Now suppose some functions that are part of the public API are using quite a few internal functions. Some of these implementation functions might even be private to the implementation module. For this reason, some developers might prefer to implement the public functions inside of the implementaiton module:

module MyMatrixLibrary {
  private module MyMatrixImplementation {
    private class InternalType { ... }
    private proc internalFunction() { ... }

    // this is the public API function
    proc matrixMultiply() { ...; internalFunction(); ... }
  }
}

Now use MyMatrixLibrary won't allow one to call matrixMultiply (because it is in a private module). How to fix this?

One way is to add to MyMatrixLibrary a declaration like proc matrixMultiply() { MyMatrixImplementation.matrixMultiply(); } which "forwards" to the implementation:

module MyMatrixLibrary {
  private module MyMatrixImplementation {
    private class InternalType { ... }
    private proc internalFunction() { ... }

    // this is the public API function
    proc matrixMultiply() { ...; internalFunction(); ... }
  }
  proc matrixMultply() { MyMatrixLibrary.matrixMultiply(); }
}

But, that can get old fast if there are a lot of functions.

Another way is to support use to do this "forwarding" for you, which is what this issue is proposing.

module MyMatrixLibrary {
  private module MyMatrixImplementation {
    private class InternalType { ... }
    private proc internalFunction() { ... }

    // this is the public API function
    proc matrixMultiply() { ...; internalFunction(); ... }
  }
  use MyMatrixImplementation only matrixMultiply;
}
lydia-duncan commented 5 years ago

If we chose not to follow this interpretation for some reason, I think the code above should generate an error ("you've written a contradiction here, so you probably mean to fix one of these two things") rather than silently being a no-op (which is what I think is happening today?).

If there's no other lFunction in scope, you will get a resolution error, though I believe it won't say that there was a version in a private module since we don't even look in there. We could add additional logic to look for versions in private modules in the case where an error occurred, since compilation is ending anyways.

lydia-duncan commented 5 years ago

For Michael's response, I might argue that in that case it would be more appropriate to leave the MyMatrixImplementation public but undocumented, especially if the implementation details are already marked private - that may be on shakier ground if we anticipate there being no replacement for pragma "no doc" when every symbol can be marked private, though, so I wouldn't consider it a strong argument by any means.

bradcray commented 5 years ago

If we chose not to follow this interpretation for some reason, I think the code above should generate an error ("you've written a contradiction here, so you probably mean to fix one of these two things") rather than silently being a no-op (which is what I think is happening today?).

If there's no other lFunction in scope, you will get a resolution error, though I believe it won't say that there was a version in a private module since we don't even look in there. We could add additional logic to look for versions in private modules in the case where an error occurred, since compilation is ending anyways.

@lydia-duncan: I think the stance you're taking is that if a sub-module is declared private then a public use of it doesn't (shouldn't) actually make anything visible (i.e., it doesn't suddenly make the symbols within the submodule available), which seems to mean that it doesn't really have any effect. In that interpretation, it seems to me that typing public use myPrivateSubmodule can only be an error / misunderstanding on the user's part, and should therefore generate an error message as a result rather than silently doing nothing. If we were unable to resolve this issue's question, then taking the "generate an error" approach would be the safe/conservative thing to do because (a) I think you've argued that we're not actually relying on any public uses of private submodules today (not that that's a proof that we wouldn't want to do so in the future), so we wouldn't break anything; and (b) if users want to write this pattern (as in Michael's example, say) then we'll get complaints / requests from them and be able to add support for it later without breaking existing (working) code.

That said, I'm not trying to railroad us in that direction necessarily. I think it's consistent with our existing rules (and proposed rules) to have a public use of a private module declaration as in Michael's examples above (FWIW, I think this is because I view private module M as meaning "M can't be referred to outside of this module" whereas I think Lydia's comments are espousing more of a blanket "nothing within M can be referred to outside of its parent module."). I'm not as worried about the seeming contradiction between public and private as Lydia is because both of the statements in question (the submodule declaration and its use) are within a given module, so presumably the module's author(s) chose to structure their code in this way for some reason.

That said, I could also imagine living with an error for a time because I don't currently feel I have pressing needs for the public use of a private submodule. But I think we should avoid continuing to be in the current state in which it is neither an error nor does it make the private module's contents available.

lydia-duncan commented 5 years ago

it seems to me that typing public use myPrivateSubmodule can only be an error / misunderstanding on the user's part, and should therefore generate an error message as a result rather than silently doing nothing. If we were unable to resolve this issue's question, then taking the "generate an error" approach would be the safe/conservative thing to do because (a) I think you've argued that we're not actually relying on any public uses of private submodules today (not that that's a proof that we wouldn't want to do so in the future), so we wouldn't break anything; and (b) if users want to write this pattern (as in Michael's example, say) then we'll get complaints / requests from them and be able to add support for it later without breaking existing (working) code.

I think that's reasonable. I would argue for not making it an error for a default use (use myPrivateSubmodule), even though the two are today equivalent.

bradcray commented 5 years ago

I would argue for not making it an error for a default use (use myPrivateSubmodule), even though the two are today equivalent.

So I think you're proposing that use M; should mean public use M; if M is a public module and private use M; if M is a private module? I hadn't thought of that, but it seems reasonable to me at a glance.

lydia-duncan commented 5 years ago

I hadn't been thinking about it that way, but I can see that being a reasonable interpretation. I had been specifically worrying about what would happen if/when we change the default behavior of a use to be private instead of public and was hoping to avoid penalizing users for relying on the default behavior and having it change under them.

What you've described seems less surprising to users and like a less breaking change, though it does shift the onus from use statements to the module declaration (which isn't a bad thing, e.g. it seems like we should instead be thinking about the default privacy declaration of a module rather than of a use, which is in keeping with my thoughts in this comment). I think it even doesn't require compiler changes to accomplish beyond ensuring the error only occurs for explicitly public uses.

lydia-duncan commented 5 years ago

I realized that we didn't actually have an issue open about the default privacy setting, so I opened #13801

mppf commented 5 years ago

See also #13523 which included some discussion of enabling something like this:

module MyProj {
  private import GNUPlot;
  public use GNUPlot;
}

in the event that MyProj wants to provide all of the symbols in GNUPlot without indicating that it is using GNUPlot to do so.

See https://github.com/chapel-lang/chapel/issues/13523#issuecomment-522968176 and https://github.com/chapel-lang/chapel/issues/13523#issuecomment-526002376

mppf commented 5 years ago

I think #13979 is probably a better place to discuss this pattern. Arguably this issue proposes one specific solution.

mppf commented 4 years ago

See https://github.com/chapel-lang/chapel/issues/13524#issuecomment-582600252 - I currently think that the answer should be "no".

bradcray commented 4 years ago

My current thinking here is that it should be possible to publicly or privately use or import a public module; and that it should be possible to publicly use a public module; but not to publicly use a private module.