Closed ben-albrecht closed 4 years ago
I was wondering if one could always use --include-subpackage instead of --package-private-M. The main issue there is that if a module wanted to work with 2 subdirectories, it'd want to add them to its paths (without adding a subpackage as well).
I don't really like the flag name --package-private-M and would propose some alternative names for it:
--include
--include-subdir
--private-include
This proposal strikes me as being pretty complex in its use of a very delicate set of ordered flags to define a hierarchy. And I have to admit that I tend to get pretty skittish about proposals that require the language or compiler to have some sort of concept of a (mason) package (is there precedent for this in other languages?). To me, from the language/compiler perspective, a mason package is just a module that uses whatever modules it needs to and defines whatever submodules it wants to, so I worry about the need to teach the compiler about packages if it's not necessary.
I know I've gotten pushback on this before, but are we sure that such module-specific dependences shouldn't be specified in require statements instead so that a given module can specify what it relies upon locally in a way that naturally maps to the module hierarchy rather than requiring that information to be pushed onto the compiler's command line, disassociating it from the module in question?
@bradcray -
This proposal strikes me as being pretty complex in its use of a very delicate set of ordered flags to define a hierarchy.
Did you see that Example 1 and Example 2 did not use any new flags at all? (I.e. I think the solution to Example 1 and Example 2 is the main part of the proposal; the new flags add additional functionality and if they're the only thing you don't like then it's worth separating that). In particular I'd like to know if you object somehow to Example 1 or Example 2.
Besides that, there is nothing about this proposal that ties it directly to mason. The names of the flags use the term "package" to reflect their expected common use. @ben-albrecht and I discussed another option for the flag names, --begin-group and --end-group, but it seemed worse to have the flag be super abstract than for it to appear to be connected to mason when it is not.
are we sure that such module-specific dependences shouldn't be specified in require statements instead so that a given module can specify what it relies upon locally in a way that naturally maps to the module hierarchy rather than requiring that information to be pushed onto the compiler's command line
That might be possible but I don't think it would solve the problem for Mason packages using modules with the same name by itself. Something about how the compiler handles the module paths will have to change to solve that problem. Or we could conceivably insist that Mason packages using more than one .chpl file use a require statement from the main package .chpl file.
In any case I think there is more to your require statement idea that I don't know about (i.e., to my understanding, require only applies to C-level include paths and libraries). Are you imagining that by "require"ing a directory, you could add that to the module search path just for the current module? That seems a bit fraught to me (require "-Msubdir/" would suddenly behave differently from how the command line flag -M normally does; require "subdir/" would apply only to Chapel paths and locally to the module, but I think of require as being mainly for C dependencies and global to the program being built).
Anyway, I view the main idea of this proposal to be this:
This issue could be solved by introducing a concept of local module search paths, i.e. each module contains its own module search path rather than using a single global module search path for all modules.
Wouldn't having require statements that are module-local also rely on the same idea? Isn't it just a matter of having an alternative to the command line to specify it? (In which case we should talk about the relative merits of specifying paths on the command line vs in source code).
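To make the "local module search path" idea concrete, here is a small Python model of the lookup. It is only a sketch under stated assumptions, not how the Chapel compiler is implemented: the function name resolve_use, the dictionary of per-module paths, and the rule that a module's local path is consulted before the global one are all invented for illustration.

```python
from typing import Dict, List, Optional, Set


def resolve_use(name: str,
                using_module: str,
                local_paths: Dict[str, List[str]],
                global_path: List[str],
                files: Set[str]) -> Optional[str]:
    """Return the file that would satisfy `use name`, or None if not found."""
    # Assumed ordering: the using module's own search path comes first,
    # then the global path shared by every module.
    for directory in local_paths.get(using_module, []) + global_path:
        candidate = f"{directory}/{name}.chpl"
        if candidate in files:
            return candidate
    return None


# Hypothetical layout: L.chpl lives next to M.chpl and is meant to be private.
files = {"M/src/M.chpl", "M/src/L.chpl", "main/main-module.chpl"}
local_paths = {"M": ["M/src"]}   # only M searches M/src
global_path = ["main"]

print(resolve_use("L", "M", local_paths, global_path, files))
print(resolve_use("L", "main-module", local_paths, global_path, files))
```

In this model M finds L, while the same `use L` from main-module finds nothing, which is the behavior the proposal is after. Whether the per-module path is filled in from the command line or from source code is a separate question that the model leaves open.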
Anyway, I view the main idea of this proposal to be this:
This issue could be solved by introducing a concept of local module search paths, i.e. each module contains its own module search path rather than using a single global module search path for all modules.
I don't think I'm objecting to this aspect of the proposal; more to the use of command-line flags to specify per-module behavior; and somewhat to the elevation of packages to a compiler-/language-level concept if it can be avoided (not using the term "package" in the flags seems like a dodge... is there no way we can relate these concepts to modules directly?).
to my understanding, require only applies to C-level include paths and libraries
Today, require statements also permit the specification of .chpl files. For example:
testit.chpl:
require "M.chpl";
proc main() {
  M.foo();
}
M.chpl:
proc foo() {
  writeln("In M.foo()");
}
Works if you do:
$ chpl testit.chpl
I don't think we've ever had support for -M path requirements, though the compiler doesn't seem to complain about them; but it doesn't seem to do anything useful with them either that I can tell. However, you can use relative paths to refer to modules as well. For example, the require statement in testit.chpl above could be written:
require "subdir/M.chpl";
I think we could also consider adding support for require-ing -M paths (and making those module-private / local) if that was considered desirable. The pushback I referred to above is that I think there has generally been concern among some about specifying too much about files/directories in source code (e.g., some expressed concerns about the original support for -I, -L, -l options, and over in a discussion about FFTW, there's a proposal to stop require-ing the library name in the sources as I understand it). And if the relative paths of the previous idea were sufficient, that's definitely easier (in that it's already implemented).
I also wonder sometimes about specifying paths using config params, though this is harder for module paths than -I / -L paths because the current compiler architecture wants/needs to know those long before param resolution has occurred (but perhaps if it were restricted to simple string literals and compile-line config params?).
What the require example above doesn't do is make module M in any way invisible to other modules: it's like putting subdir/M.chpl on the command-line where M will be parsed into the program-level scope of modules, such that M will be visible to everyone. To have it be hidden, it would somehow need to be considered a submodule of another module (which takes my mind in the direction of include statements, i.e., "literally stick this module's definition into the current scope", though I know those have generally been panned as a solution for this kind of problem).
I could imagine potentially needing to make other flavors of require statements module-private as well. For example, maybe your module relies on a helper.h file, and so does mine, but they are two different header files that are found along different include paths. So this need for different modules to access files with the same names in different places without affecting other modules seems as though it may be more general than being just for Chapel code (?). (That said, it can obviously only go so far. E.g., if your C header defines a foo() function and so does mine, I don't think there's much we can do to help avoid them conflicting in the C linker... And arguably we could address this by having the user do the C-level compilations outside of Chapel and just add requirements for .o files or libraries?)
Wouldn't having require statements that are module-local also rely on the same idea? Isn't it just a matter of having an alternative to the command line to specify it? (In which case we should talk about the relative merits of specifying paths on the command line vs in source code).
I think that's right. The kernel of what I like about thinking about this in the context of require statements is that it puts the module's dependences and requirements close to the source code that is generating those dependences, making them part of the programmatic hierarchy (modules contain modules which contain modules, each of which may have some requirements). Whereas trying to mimic that hierarchy on the command-line seems a bit fraught and fragile to me, not to mention verbose, and disassociated from the source code.
I could imagine potentially needing to make other flavors of require statements module-private as well.
We could think about making all require statements private by default (except as I said, we can't do this for C things). We could also consider supporting private require.
Anyway, I view the main idea of this proposal to be this:
This issue could be solved by introducing a concept of local module search paths, i.e. each module contains its own module search path rather than using a single global module search path for all modules.
I don't think I'm objecting to this aspect of the proposal;
That's good to know. I think we could immediately implement the change to fix Example 1 and Example 2.
more to the use of command-line flags to specify per-module behavior;
I think we should talk more about command-line vs. source code for these things, but my position is that we'll ultimately want to support both.
(not using the term "package" in the flags seems like a dodge... is there no way we can relate these concepts to modules directly?).
Sure, the flags could be called --include-module and --include-submodule.
Anyway, let's talk more (maybe in another issue) about how you'd imagine supporting submodules in a different file from a module. Such a functionality will be important in the event that the submodule wishes to refer to private functions in a parent module (since just making them both top-level modules will not allow access to the private functions).
After discussing offline with Michael: I remain unconvinced that this feature is a necessity in Chapel, though I'll admit I'm not certain about that. The direction I'd prefer to invest in for the short-term is to explore the ability to break a module and its submodules up across multiple files, and then come back to this issue.
Specifically, for example 1, it seems to me that if L is intended to be a module that helps M define its behavior but that nobody else should know about, that L should be a sub-module of M rather than a top-level module that somehow only M knows about. Or, put another way, I don't think there should be a way to inject module names into the top-level namespace that some modules can see but others can't (any more than I think there should be a way to declare a module-scope variable that some functions can see but other functions cannot).
So to me, the question example 1 poses is "Did the author actually want L to be a sub-module of M?" and if so "Is the issue really that they want a way to split module M and its submodules across multiple files to avoid having to define L within the M.chpl file?"
Or, put another way, I don't think there should be a way to inject module names into the top-level namespace that some modules can see but others can't (any more than I think there should be a way to declare a module-scope variable that some functions can see but other functions cannot).
@bradcray - Earlier you stated you were not opposed to the local module search paths part of the proposal, but this reads like you do now. Have I interpreted your current stance correctly?
So to me, the question example 1 poses is "Did the author actually want L to be a sub-module of M?" and if so "Is the issue really that they want a way to split module M and its submodules across multiple files to avoid having to define L within the M.chpl file?"
Yes, the author could make L a sub-module of M if we had a solution to splitting sub-modules across multiple files, as you suggest.
However, there are some challenges with the sub-module approach, e.g., when a project has a diamond-shaped dependency:
main/
  main-module.chpl # Uses M
M/
  src/
    M.chpl # Uses L & K
    L.chpl # Uses Utils
    K.chpl # Uses Utils
    Utils.chpl # Is this a submodule of L and K?
In any case, I think a good next step would be to explore the separated-submodule idea a bit more in a new issue and understand how we might handle some of the challenging cases, so that we have more concrete ideas to compare against each other - as @mppf mentions above.
Earlier you stated you were not opposed to the local module search paths part of the proposal, but this reads like you do now. Have I interpreted your current stance correctly?
I guess that's accurate and that the off-line discussion made me more skeptical about their importance than I had been previously. It might be most accurate to say that I'd like to see whether supporting the ability to break nested modules across multiple files + minor tweaks to directory/file organizations and conventions would obviate the need to support module-specific search paths.
I don't see your diamond-shape case as presenting a problem for sub-modules. I think the module structure you're saying you want is:
module M {
  private use L, K;
  private module L {
    private use Utils;
  }
  private module K {
    private use Utils;
  }
  private module Utils {
  }
}
And then the question becomes "How would we permit you to break this structure up across multiple files?"
Here's one answer to my question (albeit one that's generally been met with negative reviews, but just to start somewhere...):
main/
  main-module.chpl # Uses M
M/
  src/
    M.chpl # wants sub-modules L, K, Util
    subdir/
      L.chpl
      K.chpl
      Util.chpl
M.chpl:
module M {
  private use L, K;
  include "subdir/L.chpl", "subdir/K.chpl", "subdir/Util.chpl";
}
(where L.chpl, K.chpl, and Util.chpl each define the respective module from my previous comment).
Properties:
M will be able to access them (accidentally or purposefully)
M/src/subdir which is never added to the compiler's module search path, so other uses of L, K, and Util will never accidentally find them.
I think that some of the criticism of include has to do with its similarity to the C preprocessor. And yet perhaps all we need is "I'd like to write a submodule in a different file".
I'm not sure I'm on board with the requirement that the subdir exist in this situation. (Two ways to remove that requirement: first, the local module search path strategy; second, use a different filename extension for snippets of Chapel code to be included.) However I agree that it would resolve the duplicate module problem if the strategy were followed.
Additionally the need to put a path like subdir/L.chpl in the source code will run into the concern you mentioned above:
generally been concern among some about specifying too much about files/directories in source code
So here is a straw-person counter-proposal:
main/
main-module.chpl # Uses M
M/
src/
M.chpl
L.chpl // intended to be private
module M {
private module L;
}
Here the compiler could interpret module L; as "Please find L.chpl in the local module search path and include its contents here". I would expect that the compiler would allow (but not require) L.chpl to wrap all of its code in a module L { } declaration.
I'm not sure I'm on board with the requirement that the subdir exist in this situation.
Sorry, I didn't mean to imply that subdir had to exist. I think you could just as easily write include "L.chpl", "K.chpl", "Util.chpl"; after moving them up a directory level and not involve subdir at all. I wouldn't expect an include of a file to add that file's directory to the global module search path any more than subdir was in my example (so ./ wouldn't be either). In writing the example, I was imagining that M/src/ might already be in the global module search path which is why I pushed them down a level. More on that just below.
So, starting with your preferred directory structure:
main/
main-module.chpl # Uses M
M/
src/
M.chpl
L.chpl // intended to be private
I'm wondering how main-module.chpl found M.chpl to begin with (where the third answer below is what I think this issue is assuming, but for completeness...).
One possibility is that it's in the module search path. But if that's the case, then L.chpl is also in the module search path suggesting that any other module's use L is also going to find it. That is, there's nothing about L.chpl that's private if we store it in a directory that's in the module search path. And if it is, adding a module-local search path wouldn't do anything to hide it any better.
A second possibility is that main-module had require "M/src/M.chpl"; in its source. Taking this approach keeps M/src/ out of the global module search path and would prevent any other modules from finding L unless they also knew to name M/src/ in some way. So taking this approach doesn't require a local-module search path to keep L.chpl hidden because it already is. That said, this approach seems unlikely to be attractive because your main-module probably doesn't want to embed M's location in its source code.
A third possibility is that M/src/M.chpl was named on the command-line (which is the approach implied by the issue description above). Today this does add M/src/ to the global module search path, but perhaps it shouldn't. Most require statements are designed to behave similarly to adding their contents to the command-line, so this is arguably inconsistent with how the previous case was handled (or vice-versa). I believe we decided to add this feature as a convenience so that M could more easily use sibling modules in the same directory structure (see two paragraphs below for additional info). But perhaps we've actually created an inconvenience since it doesn't provide a way to name specific files without also adding their directories to the search path (and if you had really wanted to do that, perhaps you should've just specified the -M flag, in which case you presumably wouldn't have had to name the specific file for a case like this anyway).
So this makes me think that we should look into no longer having command-line Chapel files affect the global module search path, see what tests break, and whether we find them compelling. If not, we can change this behavior to not affect the global module search path, and not require a local module search path either (at least for this case/reason). This is a simple change to make (see https://github.com/bradcray/chapel/tree/relative-chpl-dont-affect-modpath) and it looks like < 75 tests use the feature, so I'll run a spot-check on them tonight and do a full run to make sure I didn't miss anything when nightly testing isn't about to run (failures due to spot-check: modules/bradc/printModStuff/foo.chpl, modules/bradc/srcDirImpliesPath/foo.chpl, modules/sungeun/ambiguous/ambiguous2.chpl, studies/hpcc/FFT/fft-testPow4.chpl... I'll need to look into whether I think these are motivating or not another day).
(For historical purposes: Why did we take this behavior? I think it's because if a file a/b/c.c is specified to a C compiler then any #includes within that file are searched for from a/b so we thought we were being symmetric. But this arguably makes more sense for a require or include statement which names a file than it does for a use statement that names a language-level identifier...)
Anyway, if we were to change this, then I think we wouldn't need a module-local search path for this case either (and at this point I want to foreshadow an important sidebar that makes up the final three paragraphs of this comment).
Am I missing any other ways that main-module could know about M's location?
Additionally the need to put a path like subdir/L.chpl in the source code will run into the concern you mentioned above:
generally been concern among some about specifying too much about files/directories in source code
Just to be clear, I don't share this concern, at least for cases like this. I think it's reasonable for an author of a big Chapel module who wants to break it into separate files to organize those files using subdirectories and specify relative paths to get to the files where they live. I'm also not sure that those who have objected to putting paths into sources in the past would object to cases like this either; what I recall hearing objections to was more around putting library search paths or include paths into sources for system-wide packages. But maybe there is a reason to avoid even simple relative paths like this when creating little code clusters that I'm not seeing.
I think that some of the criticism of include has to do with its similarity to the C preprocessor.
It's also similar to LaTeX's \input feature, which I find invaluable (I can't imagine having to put the entire Chapel language specification into a single file... Why would we require Chapel programmers writing huge module structures to do the same?).
The main criticism I've heard about include is that while it might be useful, it's not sufficient for everything users want when breaking things across multiple files because they want some sort of separate compilation ability and include just creates a way to give the compiler more source at once rather than some sort of pre-compiled thing. I think this feature request is a reasonable one, but I don't think it means that include isn't useful/valuable in and of itself. Particularly given that we don't have separate compilation yet; and once we did, presumably there'd be a way to say "include or input this precompiled module as a submodule to my current scope" as an alternative to "this uncompiled source code."
I don't mean to imply that having an include / input statement is the only way to solve the nested modules in different files problem, but it's a familiar one and doesn't seem inherently problematic to me. It can be abused of course, but most things can if you push them hard enough; and I think there are plenty of clean uses and preferred styles that make sense (e.g., included files should define entire modules, functions, or variables, not parts of lines that will be glommed together with parts of other lines). For example, in LaTeX I could put arbitrary text into each file, but I don't... I usually map each section or figure to a file by convention which is helpful to me and clean to understand.
All that said, I'm far more happy to wrestle with counterproposals to the "how do we break a module across multiple files" question than the "how do we create module-specific search directories" question because I think it solves two problems: (1) how to avoid huge monolithic files in Chapel and (2) how to encapsulate private modules so that they don't pollute the top-level program namespace.
That said, I have to admit that I'm not crazy about Michael's counterproposal:
module M {
private module L;
}
As Chapel stands today, I interpret this as: "I'm defining a private module named L. It has no body / contents" (similar to how extern proc foo(); has no body). Nothing about this statement (as compared to the current form private module L { ... }) suggests to me "look around the file system for something that defines a module named L and inject its contents here." To me, it would be surprising if such a concept did not name a file.
[One historical note that I've brushed up against a few times in this issue and want to get out in the open again: The current behavior in which use L; causes the compiler to go look for files named L.chpl was considered a poor hack the day it was introduced, and isn't something that I think we should particularly cling to or emulate. The original intention which we never had time to implement was to traverse the module search path looking for files that define module L regardless of the file's name. For instance, L-1.1.chpl or MyFilename.chpl would be parsed if they defined module L { ... }. Why did we take the current approach? Because it was simple and got us running and in many cases the two things do / did match (particularly when using implicit module names).
We made a start at doing something better with a grammar called modulefinder.ypp (that can be found in the git archives) which was meant to be a clone of chapel.ypp that mostly just dropped code on the ground but knew how to navigate comments and strings to avoid false positives. Then the idea was to create little index files that would say which modules were defined by each .chpl file and to use those indices to resolve use statements rather than leaning on the assumption that the filename had to be the same.
I think this model still has merit (lots more than the current system), though there are challenges as well: For example, if the modules that are defined by a file depend on the settings of a config param then the index files couldn't simply be updated based on the timestamps of the .chpl files, but rather would have to be sensitive to the specifics of the compilation; so perhaps rather than storing index files, the compiler should just have an ultra-fast way to find candidate .chpl files (via grep?), parse them, and see whether they defined the module or not (dropping the code on the ground if the answer was "not" and searching onward...)]
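The index-file idea described above can be sketched in a few lines of Python. This is a deliberately crude stand-in, not the modulefinder.ypp grammar: it uses a regular expression, so unlike the grammar it does not skip comments or strings and does not distinguish nested modules from top-level ones. The names modules_defined and build_index are made up for this sketch, and the fallback to the file name models Chapel's implicit module naming.

```python
import re
from pathlib import Path
from typing import Dict, List

# Crude pattern for a module declaration at the start of a line.
# A real scanner would also have to skip comments and string literals.
MODULE_DECL = re.compile(
    r"^\s*(?:(?:private|public|prototype)\s+)*module\s+([A-Za-z_]\w*)",
    re.MULTILINE,
)


def modules_defined(filename: str, source: str) -> List[str]:
    """List the module names a source file appears to define."""
    names = MODULE_DECL.findall(source)
    # No explicit declaration: assume an implicit module named after the file.
    return names or [Path(filename).stem]


def build_index(sources: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each module name to the files that appear to define it."""
    index: Dict[str, List[str]] = {}
    for filename, source in sources.items():
        for name in modules_defined(filename, source):
            index.setdefault(name, []).append(filename)
    return index


# File contents are given inline so the sketch needs no files on disk.
sources = {
    "L-1.1.chpl": "module L {\n  proc lfunction() { }\n}\n",
    "MyFilename.chpl": "proc foo() { }\n",
}
index = build_index(sources)
print(index)
```

With an index like this, `use L` could be resolved to L-1.1.chpl even though the file name does not match the module name. The sketch does nothing about the config param concern raised above, which would require the index to depend on the specifics of the compilation.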
The main criticism I've heard about include is that
I'd forgotten that Bryant also gave it a thumbs-down for other reasons in issue #10909.
Am I missing any other ways that main-module could know about M's location?
Not that I know of. Indeed, the Mason case is that M/src/M.chpl is named on the command-line. However, I personally would rather have module-local search paths than to not be able to implicitly use another module in the same directory as M.chpl. I use this feature all the time when running tests.
For example, if the modules that are defined by a file depend on the settings of a config param then the index files couldn't simply be updated based on the timestamps of the .chpl files, but rather would have to be sensitive to the specifics of the compilation
I'm not seeing how which modules a file defines could depend on a config param currently. Are you imagining that some other feature is introduced?
A third possibility is that M/src/M.chpl was named on the command-line (which is the approach implied by the issue description above). Today this does add M/src/ to the global module search path, but perhaps it shouldn't.
Right, I think we need to choose one (or more) of these:
a. Naming M/src/M.chpl on the command line doesn't add M/src/ to the global module search path
b. Naming M/src/M.chpl on the command line does add M/src/ to a local module search path (within M.chpl, files in M/src can be used) but not to the global module search path.
c. We change mason to use a different (perhaps not yet available) way of communicating a package module path to the compiler. This different way would not affect the global module search path.
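The difference between today's behavior and options (a) and (b) can be shown with a small Python model. It is a sketch only: the function search_paths and the policy labels "today", "a", and "b" are invented here, option (c) is not modeled because it depends on a mechanism that does not exist yet, and real search paths contain more entries than the directories of command-line files.

```python
import os
from typing import Dict, List, Tuple


def search_paths(cmdline_files: List[str],
                 policy: str) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return (global_path, local_paths) for files named on the command line.

    policy is a label invented for this sketch:
      "today": each file's directory joins the global module search path
      "a":     naming a file adds nothing to any search path
      "b":     each file's directory joins only that module's local path
    """
    global_path: List[str] = []
    local_paths: Dict[str, List[str]] = {}
    for path in cmdline_files:
        directory = os.path.dirname(path)
        module = os.path.splitext(os.path.basename(path))[0]
        if policy == "today":
            if directory not in global_path:
                global_path.append(directory)
        elif policy == "b":
            local_paths[module] = [directory]
        elif policy != "a":
            raise ValueError(f"unknown policy: {policy}")
    return global_path, local_paths


files = ["main/main-module.chpl", "M/src/M.chpl"]
for policy in ("today", "a", "b"):
    print(policy, search_paths(files, policy))
```

Under "today", M/src ends up visible to every module, so main-module can reach L.chpl. Under "a" nobody can find L implicitly, including M. Under "b" only M has M/src on its path.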
The original intention which we never had time to implement was to traverse the module search path looking for files that define module L regardless of the file's name.
I have in the past bristled at the way that use M works today and I agree with you that something about it probably needs to change. However, I view the currently missing feature as a way to explicitly indicate which file you want to gather a module from. Perhaps the require syntax followed by a use would do it. But, if we used require that way, we'd still need some sort of "local search path". Why? Because the require Something.chpl would say "Go find Something.chpl please and allow modules defined in it to be used from this module". In particular it would not say "Please make modules in Something.chpl available to use from all modules".
I think we still have a problem that requires module-local search paths. The reason for that is that if we allow a (more explicit) way to indicate where a module is coming from, then it needs to be checked before the global module search path and not apply to other modules.
I tried using require in this way with the current compiler and here is an example and the problems I ran into:
main/
main-module.chpl # Uses M
M/
src/
M.chpl # wants to privately use L / have submodule L
subdir/
L.chpl
chpl main-module.chpl M/src/M.chpl
// main-module.chpl
use M;
proc main() {
  mfunction();
  use L; // currently compiles but I want it to be an error
         // because L is intended to be private to the package M.
}
// M/src/M.chpl
module M {
  //require "L.chpl"; // doesn't find L.chpl
  //require "./L.chpl"; // doesn't find L.chpl
  require "M/src/subdir/L.chpl"; // works but requires specific working directory
  proc mfunction() {
    use L only;
    L.lfunction();
  }
}
// M/src/subdir/L.chpl
module L {
  writeln("initing L");
  proc lfunction() {
    writeln("in lfunction");
  }
}
An idea of module-local search paths would solve 2 problems in this example:
It would make use L in main-module.chpl an error, since L won't be findable in the global path.
It would allow M.chpl to require "L.chpl" in a way that doesn't assume anything about the current directory of the compilation call.
That said, I have to admit that I'm not crazy about Michael's counterproposal:
module M { private module L; }
As Chapel stands today, I interpret this as: "I'm defining a private module named L. It has no body / contents" (similar to how extern proc foo(); has no body). Nothing about this statement (as compared to the current form private module L { ... }) suggests to me "look around the file system for something that defines a module named L and inject its contents here." To me, it would be surprising if such a concept did not name a file.
Sure, we can address that. What about
module M {
private module L in "L.chpl";
}
Anyway I think the big question is if we want submodules-in-different-files to be handled by:
From the original post, I completely agree with Examples 1 and 2. Because that would be only a behavioral change with the compiler with no new compiler options, I don't see a reason not to do it today.
For Subdirectories and Examples 3 and 4: nope. I don't want to be in perpetual servitude to an arbitrary layout of my filesystem directories. The user should not be allowed to arbitrarily put files in subdirectories only to then go about representing that layout differently in their actual code. The code should dictate where to put modules, not the other way around. That way, we don't get into this mess with require statements, include statements, or whatnot.
For the following code:
module MyModule {
use Submodule1;
use Submodule2;
}
There should only be a few known layouts for it, including a few combinatorial layouts between 2 and 3.
1
├── main.chpl # Uses MyModule
├── MyModule.chpl
├── Submodule1.chpl
└── Submodule2.chpl
2
├── main.chpl
└── MyModule
├── MyModule.chpl
├── Submodule1.chpl
└── Submodule2.chpl
3
├── main.chpl
└── MyModule
├── MyModule.chpl
├── Submodule1
│ └── Submodule1.chpl
└── Submodule2
└── Submodule2.chpl
All three of these are compiled with the same line:
chpl main.chpl
Local module search paths:
main.chpl:
$CHPL_HOME/modules/* # standard library
<current directory of main.chpl> # Find MyModule in 1
If MyModule not found: <current directory>/MyModule/MyModule.chpl # Find MyModule in 2 and 3
MyModule.chpl:
$CHPL_HOME/modules/* # standard library
<current directory of MyModule.chpl> # Find Submodule{1,2} in 2
If Submodule1 not found: <current directory>/Submodule1/Submodule1.chpl
If Submodule2 not found: <current directory>/Submodule2/Submodule2.chpl
Submodule1.chpl
$CHPL_HOME/modules/* # standard library
<current directory of Submodule1.chpl>
Submodule2.chpl
$CHPL_HOME/modules/* # standard library
<current directory of Submodule2.chpl>
# The if-conditionals could be considered an optimization.
#
# As an aside, Rust represents the top-level module with file `lib.rs`. The local module search
# paths listed would go to e.g., <current directory>/MyModule/lib.chpl instead of fully naming
# the already known name.
Look at the directory tree. Can you tell which are the parent modules and which are the submodules? The compiler should be able to do this too without any new compiler options.
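To make the lookup rule above concrete, here is a minimal Python sketch of it. It is only a model of the proposal, not how the Chapel compiler works today: the function names `candidate_paths` and `resolve` are made up for illustration, the standard library step is omitted, and the file set stands in for layout 3.

```python
from pathlib import PurePosixPath

def candidate_paths(using_file, module):
    """Files to try, in order, for `use <module>` written inside using_file."""
    here = PurePosixPath(using_file).parent
    return [
        str(here / f"{module}.chpl"),           # sibling file
        str(here / module / f"{module}.chpl"),  # same-named subdirectory
    ]

def resolve(using_file, module, files):
    """Return the first candidate that exists in `files`, else None."""
    for path in candidate_paths(using_file, module):
        if path in files:
            return path
    return None

# Layout 3 from above, as a set of paths relative to the project root.
layout3 = {
    "main.chpl",
    "MyModule/MyModule.chpl",
    "MyModule/Submodule1/Submodule1.chpl",
    "MyModule/Submodule2/Submodule2.chpl",
}

print(resolve("main.chpl", "MyModule", layout3))
print(resolve("MyModule/MyModule.chpl", "Submodule1", layout3))
print(resolve("main.chpl", "Submodule1", layout3))  # not visible from main.chpl
```

The point of the last call is that `main.chpl` has no way to reach `Submodule1` directly, since each file only searches relative to its own directory.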
So, Example 3 is turned into:
# Original Post
main/
    main-module.chpl # Uses M
M/
    src/
        M.chpl # Uses L and K
        L.chpl
        subdir/
            K.chpl
# My Proposed Layout
main/
    main-module.chpl # Uses M
M/
    src/
        M.chpl # Uses L and K
        L.chpl
        K/
            K.chpl
compiled with:
# Same as Example 1
chpl main/main-module.chpl M/src/M.chpl
and Example 4:
# Original Post
main/
main-module.chpl # Uses M and A
M/
src/
M.chpl # Uses L and K
L.chpl
subdir/
K.chpl
A/
src/
A.chpl # Uses B and C
subdir/
B.chpl # Uses C
subsubdir/
C.chpl
# My Proposed Layout
main/
main-module.chpl # Uses M and A
M/
src/
M.chpl # Uses L and K
L.chpl
K/
K.chpl
A/
src/
A.chpl # Uses B and C
B/
B.chpl # Uses C
C/
C.chpl
compiled with:
# Essentially same as Example 1
chpl main/main-module.chpl M/src/M.chpl A/src/A.chpl
Most of the comments in this thread talk about the Original Examples 3 and 4, which I am not in favor of supporting because of the unnecessary complexity and discussion it has generated. This solution seems way cleaner.
Reading through the comments more carefully, Brad's responses have actually been pushing back against the idea of local module search paths. Global module search paths are right in line with the current status quo where module members have default public visibility and `use` statements import symbols into the global namespace. :+1: (But seriously. I hate all three of these behaviors.)
@bradc: Specifically, for example 1, it seems to me that if L is intended to be a module that helps M define its behavior but that nobody else should know about, that L should be a sub-module of M rather than a top-level module that somehow only M knows about. Or, put another way, I don't think there should be a way to inject module names into the top-level namespace that some modules can see but others can't (any more than I think there should be a way to declare a module-scope variable that some functions can see but other functions cannot).
So to me, the question example 1 poses is "Did the author actually want L to be a sub-module of M?" and if so "Is the issue really that they want a way to split module M and its submodules across multiple files to avoid having to define L within the M.chpl file?"
Example 1 is good to me. I want to expose a set of public APIs through a top-level module called M. I don't mind defining a submodule L in a separate file if it means I can break away some of those components into a logical grouping. If I can't break that module into a logical grouping called L, then they deserve to be in a monolithic file because it's all related functionality anyway.
One place where this helps is with #12712 in a refactor of all stable standard modules into top-level module `std`.
@mppf: So here is a straw-person counter-proposal:
main/
    main-module.chpl # Uses M
M/
    src/
        M.chpl
        L.chpl // intended to be private
module M { private module L; }
Here the compiler could interpret `module L;` as "Please find `L.chpl` in the local module search path and include its contents here". I would expect that the compiler would allow (but not require) `L.chpl` to wrap all of its code in a `module L { }` declaration.
While just a straw-person, the problem with this approach is that it doesn't work for #12712.
@bradc: One possibility is that it's in the module search path. But if that's the case, then L.chpl is also in the module search path suggesting that any other module's `use L` is also going to find it. That is, there's nothing about L.chpl that's private if we store it in a directory that's in the module search path. And if it is, adding a module-local search path wouldn't do anything to hide it any better.
The local module search paths would need to learn how to look for files too. The problem doesn't occur if `main.chpl` only knows how to find exactly `M/src/M.chpl` and not look all inside `M/src/`.
A second possibility is that `main-module` had `require "M/src/M.chpl";` in its source. Taking this approach keeps `M/src/` out of the global module search path and would prevent any other modules from finding L unless they also knew to name `M/src/` in some way. So taking this approach doesn't require a local-module search path to keep L.chpl hidden because it already is. That said, this approach seems unlikely to be attractive because your main-module probably doesn't want to embed M's location in its source code.
I don't want to embed directory paths to files at all. It's unnecessary information if we outright forbid that directory structure from existing with local module search paths. More critically, semantic imports through `use` statements will help incremental compilation. C++ has decades of experience with `#include` statements causing all kinds of nightmares for incremental compilation; granted, Chapel has not gotten to those same problems (yet) because the language doesn't have the equivalent of `#define` or `#include`.
So this makes me think that we should look into no longer having command-line Chapel files affect the global module search path, see what tests break, and whether we find them compelling.
I wouldn't put too much emphasis on the test suite given that there likely aren't a lot of tests with large hierarchical module dependencies in there.
Just to be clear, I don't share this concern, at least for cases like this. I think it's reasonable for an author of a big Chapel module who wants to break it into separate files to organize those files using subdirectories and specify relative paths to get to the files where they live. I'm also not sure that those who have objected to putting paths into sources in the past would object to cases like this either; what I recall hearing objections to was more around putting library search paths or include paths into sources for system-wide packages. But maybe there is a reason to avoid even simple relative paths like this when creating little code clusters that I'm not seeing.
I dislike all of the stated behaviors. :smile: Maybe that's because I like controlling Chapel things from within Chapel source and not controlling C or command-line things from within Chapel source. The compiler can control the local module search paths without having to resort to filesystem-level constructs and the duplication of functionality when nested `use` statements could make the compiler smarter.
// A_Include.chpl
module A {
include "B.chpl"
include "C.chpl"
include "D.chpl"
}
// A_Include_Bad.chpl
module A {
module B {
include "B.chpl" // Did I just wrap B in an outer B?
}
module C {
include "C.chpl"
}
module D {
include "C.chpl" // Whoops.
}
}
// A_LocalModuleSearchPaths.chpl
module A {
use B;
use C;
use D;
}
It's also similar to LaTeX's `\input` feature, which I find invaluable (I can't imagine having to put the entire Chapel language specification into a single file...
While I don't want to go up against the behemoth that is LaTeX, this is one of the first results regarding `\input` versus `\include`. The biggest benefit for the more restrictive `\include` is incremental compilation.
... Why would we require Chapel programmers writing huge module structures to do the same?)
Because we can be better! Do you want faster horses or a car? :+1:
I think this feature request is a reasonable one, but I don't think it means that `include` isn't useful/valuable in and of itself.
I'd personally like whatever solution occurs with modules to be useful enough that `include` wouldn't be needed. Again, incremental compilation is going to be problematic for `include`. (Where do you create the compilation boundaries when you can potentially include arbitrary files in a directory structure?) But I can't say I fully understand all the relevant concerns especially around generics.
Particularly given that we don't have separate compilation yet; and once we did, presumably there'd be a way to say "`include` or `input` this precompiled module as a submodule to my current scope" as an alternative to "this uncompiled source code."
I'd advocate for `use` statements to be some of this. Semantic imports!
I don't mean to imply that having an `include`/`input` statement is the only way to solve the nested modules in different files problem, but it's a familiar one and doesn't seem inherently problematic to me.
See above regarding C++ modules after decades of experience with `#include`.
It can be abused of course, but most things can if you push them hard enough; and I think there are plenty of clean uses and preferred styles that make sense (e.g., `include`d files should define entire modules, functions, or variables, not parts of lines that will be glommed together with parts of other lines). For example, in LaTeX I could put arbitrary text into each file, but I don't... I usually map each section or figure to a file by convention which is helpful to me and clean to understand.
Unless the language enforces it, someone will do something tricky with it, which is where `#include` and `#define` directives have been an issue. Do you like manually specifying source-file dependencies in Makefiles? I sure don't.
I'd advocate for something much more restrictive that provides the necessary information for compilers to do their jobs while giving enough flexibility to the programmer to lay out their directory structure without low-level hooks to let humans be the creative entities that they are in doing said tricky things. I'd argue that this isn't new ground; modern languages are all gravitating away from these low-level behaviors regarding package directory structures. Though taken to the extreme, something to avoid is Python's behavior where it's `stat`ing all over the filesystem.
The original intention which we never had time to implement was to traverse the module search path looking for files that define module `L` regardless of the file's name. For instance, `L-1.1.chpl` or `MyFilename.chpl` would be parsed if they defined `module L { ... }`. Why did we take the current approach? Because it was simple and got us running and in many cases the two things do / did match (particularly when using implicit module names).
Uh, so if I wanted to edit someone else's module L, I'd have to grep all source files to find it? I'm glad that's not the world we ended up in; restrictive is better for the compiler! (From my previous point, that also sounds pretty expensive for the filesystem.)
@mppf: I have in the past bristled at the way that `use M` works today and I agree with you that something about it probably needs to change. However I view the feature currently missing is a way to explicitly indicate which file you want to gather a module from. Perhaps the `require` syntax followed by a `use` would do it. But, if we used `require` that way, we'd still need some sort of "local search path". Why? Because the `require Something.chpl` would say "Go find Something.chpl please and allow modules defined in it to be `use`d from this module". In particular it would not say "Please make modules in Something.chpl available to use from all modules".
I agree; `use M`'s behavior should change to become more powerful.
Handy references to consider:
Another example. I'll use `use` but pretend that we're not worried about global namespace pollution in the active scope (i.e., `import`, or advocating for `use` to do more/different).
module Foo {
public use Bar.Baz;
private use Details; // Only available to Foo and any of its submodules like Bar or Bar.Baz.
}
module Bar {
public use Baz; // If we don't do this, then Foo can't `use Bar.Baz` either.
}
proc main() {
use Foo;
//use Foo.Bar; // No! You can't do this. Foo didn't `public use` Bar.
use Foo.Baz; // Ok. Foo has Baz in scope. main doesn't need to know that it occurs through Bar.
//use Foo.Bar.Baz; // Nope!
//use Foo.Details; // Can't do this either. Details is private.
}
# One notional directory structure.
root
├── Foo
│ ├── Bar
│ │ ├── Bar.chpl
│ │ └── Baz.chpl
│ ├── Details
│ │ ├── DetailsA.chpl
│ │ ├── DetailsB.chpl
│ │ ├── DetailsC.chpl
│ │ └── Details.chpl
│ └── Foo.chpl
└── main.chpl
Local module search paths:
main.chpl
<std>
# Where is Baz? Look in module Foo
<curdir>
If Foo not found: <curdir>/Foo/Foo.chpl
Foo
<std>
# Where is Baz? Look in module Bar
# Where is Details?
<curdir>
If Bar not found: <curdir>/Bar/Bar.chpl
If Details not found: <curdir>/Details/Details.chpl
Bar
<std>
# Where is Baz?
<curdir>
If Baz not found: <curdir>/Baz/Baz.chpl
Baz
<std>
# Uses Foo.DetailsA
<curdir>
<path to DetailsA> # already known?
Details
<std>
# Uses Details{A,B,C}; Where are they?
<curdir>
If DetailsA not found: <curdir>/DetailsA/DetailsA.chpl
If DetailsB not found: <curdir>/DetailsB/DetailsB.chpl
If DetailsC not found: <curdir>/DetailsC/DetailsC.chpl
But note the dragons. I'm not sure how Details will get exposed to Bar or Baz. Rust can use a top-of-level absolute path to handle this case. So maybe this isn't quite the right model and we have to be even more restrictive and start our local search elsewhere like the `root` of the tree. Clearly, Rust has thought through more of these issues than we have.
Edit: Added search paths for modules under Foo looking for Details. Maybe? I don't know.
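The visibility rules in the `Foo` / `Bar` / `Baz` example can be modelled in a few lines of Python. This is only a sketch of the intended semantics, under the assumption that a dotted path is usable from outside only if every step is a `public use` re-export. The `exports` table, `top_level` set, and `can_use` function are hypothetical names, not anything in the Chapel compiler.

```python
PUBLIC, PRIVATE = "public", "private"

# Modules that main can name directly (assumption: only Foo is on its search path).
top_level = {"Foo"}

# module -> {name it brings into scope: visibility of that `use`}
exports = {
    "Foo": {"Baz": PUBLIC, "Details": PRIVATE},
    "Bar": {"Baz": PUBLIC},
}

def can_use(path):
    """True if an outside scope such as main may write `use <path>`."""
    head, *rest = path.split(".")
    if head not in top_level:
        return False
    current = head
    for name in rest:
        # Each step must have been re-exported with `public use`.
        if exports.get(current, {}).get(name) != PUBLIC:
            return False
        current = name
    return True

for candidate in ["Foo", "Foo.Bar", "Foo.Baz", "Foo.Bar.Baz", "Foo.Details"]:
    print(candidate, can_use(candidate))
```

Under this model only `Foo` and `Foo.Baz` are accepted, which matches the comments in `main` above. It says nothing about how `Details` becomes visible to `Bar` or `Baz`, which is the open question.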
@BryantLam - wow that's a lot of comments :)
I wanted to bring up some things related to your proposal in https://github.com/chapel-lang/chapel/issues/12923#issuecomment-493811580 but please note that I'm not yet trying to express an opinion about it.
First, I want to understand how your proposal is different from what I proposed in #10946. I think that the difference is that your proposal doesn't actually create submodules. Instead, it is a module search adjustment. In particular, when looking for `M.chpl`, the compiler will be willing to look for `M/M.chpl` in current search paths in addition to looking for `M.chpl`. Is that right?
Second, does the `module L;` strategy work for standard modules in std namespace?
@mppf: So here is a straw-person counter-proposal:
main/
    main-module.chpl # Uses M
M/
    src/
        M.chpl
        L.chpl // intended to be private
module M { private module L; }
Here the compiler could interpret `module L;` as "Please find `L.chpl` in the local module search path and include its contents here". I would expect that the compiler would allow (but not require) `L.chpl` to wrap all of its code in a `module L { }` declaration.
While just a straw-person, the problem with this approach is that it doesn't work for #12712.
I don't understand why this couldn't work for #12712. Certainly in that situation, we wouldn't want all of the submodules to be private, but I was here trying to propose that `module L;` would look for `L.chpl` somewhere and the `private` part was entirely optional. So in particular a sketch of the standard library would be this:
module std {
module Sort;
module Random;
...
}
Keeping in mind that this is a straw-person proposal, I do have sympathy for Brad's objection to it that `module SomeName;` appears to define an empty module. However I don't think it's that different from a function signature, which says that the function exists but does not define its body. (We have this pattern now for `extern` functions, but it'll probably come up with interfaces, and C/C++ programmers are definitely familiar with it). Anyway my hypothesis here is that it is possible to adjust the syntax for this idea to address the objection.
Lastly, I wanted to bring up the interaction of privacy control with submodules.
Back to the first example, we have
main/
main-module.chpl # Uses M
M/
src/
M.chpl # Uses L
L.chpl
and the desire is to arrange it so that `L` is private to `M`. The original issue description above proposes a search-path way of doing that - where `L` is not visible because it's not in the global module search path (in normal usage that names `M/src/M.chpl` on the compile line, anyway).
However I do think it's also reasonable to wonder - what if `M` had private functions/types/variables? Since here `L` is part of the implementation of `M`, what if the author of that module wanted `L` to be able to access these private functions/types/variables?
If L is literally a submodule to M, it can use private things in M, because it is part of M:
module M {
  private proc privateProc() { }
  module L {
    privateProc();
  }
}
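I can see two strategies here:
1. Tie module search paths to privacy rules to allow this pattern
2. Support (somehow) explicitly creating a submodule in a different file.
I think that the `module L;` proposal as well as #10946 both use approach 2. I can't tell if your proposal solves this problem at all, or if you would use approach 1.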
I did look at your "Here be dragons" link for Rust and didn't find anything in there that surprised me. (But maybe I don't know what to look for). However I did notice that Rust uses exactly the corresponding syntax from my straw-person proposal: https://doc.rust-lang.org/book/ch07-02-modules-and-use-to-control-scope-and-privacy.html#separating-modules-into-different-files
Using a semicolon after `mod sound` instead of a block tells Rust to load the contents of the module from another file with the same name as the module.
Additionally, I've tried to understand other elements of Rust's design here.
- `MyModule` can be found in `MyModule.rs` or in `MyModule/MyModule.rs` (which is similar to what @BryantLam proposed) -- https://doc.rust-lang.org/reference/items/modules.html#module-source-filenames
- If a module is in `MyModule/MyModule.rs` and that declares a module in another file `Impl.rs`, doesn't the Rust compiler need to know to look for it in `MyModule/` and not just in the global search path? Maybe it is just that a `mod Impl;` declaration always looks in the directory storing the source code for the module it is contained in? That is, the file `MyModule/MyModule.rs` contains `mod Impl;` and therefore it looks for `Impl.rs` in `MyModule/`?
- ... `require "SomeDirectory/SomeFile.chpl"` or to the proposed `module L in "L.chpl";`.
@mppf: First, I want to understand how your proposal is different from what I proposed in #10946. I think that the difference is that your proposal doesn't actually create submodules. Instead, it is a module search adjustment. In particular, when looking for `M.chpl`, the compiler will be willing to look for `M/M.chpl` in current search paths in addition to looking for `M.chpl`. Is that right?
I think that's right. My proposal doesn't create submodules; rather it limits the module search paths instead. After looking at #10946 more closely, I didn't grasp the distinction between a submodule versus `use`ing another module until now, so I can see why my proposal and some of my examples might not be solving the core problem of representing submodules. Your comment has helped a lot to clarify my misunderstanding.
Does the `module L;` strategy work for standard modules in std namespace?
I don't understand why this couldn't work for #12712. Certainly in that situation, we wouldn't want all of the submodules to be private, but I was here trying to propose that `module L;` would look for `L.chpl` somewhere and the `private` part was entirely optional.
Whoops! Sorry, you're right! I read your code snippet too quickly and thought that `private` was the relevant portion when it was actually the `module L;` piece. Your approach looks good to me since [sub]modules already feel like they could be a kind of extern anyway when you break them into separate files.
Privacy control and submodules
I can see two strategies here:
1. Tie module search paths to privacy rules to allow this pattern
2. Support (somehow) explicitly creating a submodule in a different file.
I think that the `module L;` proposal as well as #10946 both use approach 2. I can't tell if your proposal solves this problem at all, or if you would use approach 1.
I think my proposal--being an extension of the original post at the top--will likely not address the submodules distinction unless strategy 1 enables that effect. The proposal only handles the local search paths, so I think if we substitute `module L;` for `use L;`, as in your straw-person proposal, it will achieve the intended effect. #10946 might be fine too, but I feel like that approach has less control since I can't explicitly say a module is private in the supermodule (?). I'm a bit fuzzier on that approach and would want to see more if seriously considered.
What does Rust do?
More (hopefully relevant) references:
@BryantLam - Thanks especially for the links to Rust's previous work. I see also Revisiting Rust's modules, part 2 and - from a different author - The Rust module system is too confusing.
Also, it looks like what was actually agreed upon and implemented from those blog posts is in RFC 2126. As I understand it, that RFC does support the idea that a directory can represent a module - but it doesn't make it mandatory. (In particular, if you search for `mod cli;`, you will see that the file `cli.rs` can refer to the files in `cli/` and as a result the module `cli` is substantially stored in the directory `cli`).
One thing that is clear to me is that just as this is one of the few Chapel issues where I see people putting 👎 on ideas... the module system discussions for Rust were pretty contentious. Somehow it seems inherent to the topic.
Anyway, it looks to me like the authors of the blog posts mentioned above would like for Rust modules to more closely map to files and directories. However AFAIK this is not what Rust has done, at least in part due to backwards compatibility issues. I think that is a reasonable direction for us to go - or to at least seriously consider. Certainly one could view #10946 as a starting point in that direction. Note that some of the blog posts even argue for deprecating `mod L;` type declarations - which is what we have been discussing in the straw-person proposal. These would not be necessary with the idea of a directory-as-a-module.
There is an important difference from #10946 and the Rust proposals around directory-as-module. In #10946, I proposed that the files within a directory would be submodules. But in Revisiting Rust's modules, the files in a directory collectively create a module, and submodules are stored in their own subdirectory.
Why do they think that it's generally better for the directory structure to match the module hierarchy? Because it's less confusing (especially for beginners) and also because it allows one to know where to look for a particular piece of code in a larger project.
So, I think the main question at this point is this - should the recommended style for submodules in different files involve an idea of directories representing submodules? That is, that a module can be represented by a directory, with submodules represented by subdirectories?
What could this look like, in terms of Example 1 from this issue?
`M.chpl` and `M`
Directory Layout:
main/
    main-module.chpl # Uses M
M.chpl
M/
    L.chpl
Compilation of Main Module:
chpl main/main-module.chpl M.chpl
M.chpl could `private use L`. The compiler would know that when compiling code in `M.chpl`, it can also look in `M/*.chpl` to satisfy `use`. Additionally, M.chpl could have a call like `L.foo()` which would be allowed in `M.chpl` even without a use statement. (We get that behavior today if L.chpl is included on the command line - here the files in `M/*.chpl` would be similarly treated, but only when handling code in `M.chpl` and not in say `main-module.chpl`). `main-module.chpl` would not be able to `use L` or to refer to it unless M.chpl includes `public use L`. (I think we are planning to move towards `private use` being the default but that's another issue). Lastly, the compiler would consider `L` to be a submodule of `M` for privacy / scoping purposes.
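A small Python sketch of the "`M.chpl` may also look in `M/*.chpl`" rule, under the assumption that the extra candidates are exactly the `.chpl` files directly inside the directory named after the source file. The helper name `private_candidates` and the file list are illustrative only.

```python
from pathlib import PurePosixPath

def private_candidates(source_file, files):
    """Module names that only code in source_file could `use` under this rule."""
    private_dir = PurePosixPath(source_file).with_suffix("")  # M.chpl -> M
    return sorted(
        PurePosixPath(f).stem
        for f in files
        if f.endswith(".chpl") and PurePosixPath(f).parent == private_dir
    )

files = ["main/main-module.chpl", "M.chpl", "M/L.chpl"]

print(private_candidates("M.chpl", files))                 # code in M.chpl can see L
print(private_candidates("main/main-module.chpl", files))  # main-module gets nothing extra
```

In this model `L` is offered only while compiling `M.chpl`, so `main-module.chpl` could reach it only through whatever `M` chooses to re-export.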
More links from Rust:
@mppf: One thing that is clear to me is that just as this is one of the few Chapel issues where I see people putting -1 on ideas... the module system discussions for Rust were pretty contentious. Somehow it seems inherent to the topic.
I wrestled with this topic myself coming from a C/C++ background, but I do believe it is better that Chapel enforces a packaging standard in the long run. The majority of programmers are reading/maintaining code way more often than writing code. I'd personally want any user of Chapel to be able to quickly learn/scan through any package source because all packages/codebases would have a consistent filesystem layout, whatever that layout may be.
Why do they think that it's generally better for the directory structure to match the module hierarchy? Because it's less confusing (especially for beginners) and also because it allows one to know where to look for a particular piece of code in a larger project.
I completely agree. This rationale deserves repeating because Python source is laid out in a similar way and I'd like to think that the spaces-vs-tabs debate went away with code formatters and style guides similar to how Python's rigid packaging hierarchy removed a similar debate for new codes being written.
There is an important difference from #10946 and the Rust proposals around directory-as-module. In #10946, I proposed that the files within a directory would be submodules. But in Revisiting Rust's modules, the files in a directory collectively create a module, and submodules are stored in their own subdirectory.
Why do they think that it's generally better for the directory structure to match the module hierarchy? Because it's less confusing (especially for beginners) and also because it allows one to know where to look for a particular piece of code in a larger project.
I agree, especially since it is easier to grok by a new user. While I do empathize with @bradc's desire for what amounts to multiple files inlined into a `module { ... }`, I don't think that's the common case, or at least not common enough to justify deviating from simply having the filesystem represent the module hierarchy.
If such a feature were desired, the proposal in #10946 actually notes inlined-files-as-module as a possibility from the original Rust proposal where files in a directory were concatenated/inlined into that module and submodules must be directories. This model would not be that hard to understand either, but it is different enough from Python and Rust that it has to be taught. I'm okay with either option since the inlining/concatenating approach affords an additional capability. It does, however, deviate from Chapel's notion of file-level modules, though that problem is also present with #10909's include statement.
More questions related to module search paths:
Ambiguities? Visibility of name conflicts between user modules and Mason packages?
For example, how would you specify between the two `Foo`s?
module Baz {
  ...
}
module Foo {
  module Bar {
    module Foo {}
    use Foo; // ambiguous; or (my preference) child-Foo using relative paths
    // Python-like syntax.
    use .Foo; // child-Foo referencing `self::Foo` module in Rust
    use ..; // parent-Foo referencing `super` module in Rust
    use Baz; // Today, this would work. Should it? What about other packages?
    use /Baz; // Unambiguous from top. My fake syntax that starts from "root".
    // .. What does "top" even mean?
  }
}
Another example in https://github.com/chapel-lang/chapel/issues/10946#issuecomment-495452892.
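To show how the fake path syntax in the snippet above could disambiguate the two `Foo`s, here is a minimal Python sketch. It assumes the made-up forms `.Name`, `..`, and `/Name` from that snippet and, as a further assumption, treats a bare name as relative to the current module (the stated preference). `resolve_use` is a hypothetical helper, and none of this is existing Chapel syntax.

```python
def resolve_use(current, spec):
    """Map a `use` spec written inside module `current` to an absolute module path.

    `current` is the enclosing module as a list, e.g. ["Foo", "Bar"].
    """
    if spec.startswith("/"):           # absolute: start from the root
        return spec[1:].split(".")
    if spec == "..":                   # parent module (Rust's `super`)
        return current[:-1]
    if spec.startswith("."):           # explicit child (Rust's `self::`)
        return current + spec[1:].split(".")
    return current + spec.split(".")   # bare name: assumed relative

inside = ["Foo", "Bar"]
for spec in [".Foo", "..", "/Baz", "Foo"]:
    print(spec, "->", ".".join(resolve_use(inside, spec)))
```

With a purely relative default, a bare `use Baz` inside `Foo.Bar` would resolve to `Foo.Bar.Baz` and fail, so a real design would need either a fallback search outward or a requirement to write the absolute form.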
Absolute vs. relative pathing. Is there one? Relative pathing is more natural. This would affect the ambiguous search case and `Baz`.
The compilation boundary in Rust is a crate. The default visibility for items in a crate was changed from private to `pub(crate)` in order to facilitate easier reuse of modules within the crate among sibling modules. Before that change, there was excessive re-exporting of items that could make the apparent module hierarchy (the filesystem layout) significantly different than the actual module hierarchy.
Chapel doesn't really have a notion for a package, but I think the question of item visibility will also be a concern in order to minimize re-exports. Python doesn't have this problem because everything is public visibility (for better or worse), so maybe it's not a big concern since Chapel already has default public behavior; the main downside to this behavior is when someone `use`s a deep submodule of a package that was not intended to be exposed outside and their code later breaks because that submodule changed/was deleted (but maybe that's on them).
Question: Will Chapel need the same pathing distinctions that Rust and Python have?
I think so - I think we'll need a way to specify the difference between an absolute path and a relative one, at the least. I think this is only about `use` statements though, to be clear.
Question: What's the default search-path behavior?
I would agree that relative paths are more natural. However I'm open to considering the alternative.
Question: What's the visibility for items in a "package"?
We could introduce a visibility like `package` as an alternative to `public` and `private` and make that default. That would amount to following Rust's rules most closely.
But either way, if there is some module `M` that wants to also export `M.Detail`, it would need to `public use M.Detail;` (likely it would "use only" but IIRC we are thinking about changing that default). If it did `private use M.Detail` or just used things like `M.Detail.someFunction()` (with no `use` of `Detail` at all), I would not expect that `M.Detail` would be available to code using `M`.
That leads me to wonder if it would be good enough to rely on that property to control whether or not `Detail` is exported at all from `M`. In that event, functions eligible for export in `Detail` would be marked `public`, but they wouldn't necessarily be available if `Detail` were not exported or if the functions were not included in a `public use` bringing in symbols in addition to the module name.
I've ignored this issue for a month because it was driving me a little crazy when I was active on it, and then the conversation snowballed to the point where I was unable to keep up (which then sapped my motivation to even try to catch up). I started into an attempt to catch up with it today and quickly felt overwhelmed again, so ended up just taking a really quick first pass through it, mostly skimming for the sake of time, and trying not to get too hung up on details. I have a feeling that what we're going to need at some point (maybe now, but more likely not quite yet) is a new issue proposing a strawperson plan that wouldn't require everyone to digest the discussion on this one in order to understand it.
As a baby step towards catching up and re-engaging on this topic, let me try and state the concern that I was left with a month ago and that's been rattling around in my head since dropping off. It seems relevant to a few of the comments that caught my eye today as I was trying to catch up, like this one:
Bryant: I didn't grasp the distinction between a submodule versus `use`ing another module until now
In doing so, I'm going to ignore (for now) a bunch of other questions that were asked of me and comments that seemed like they wanted a response to try and keep this manageable. Moreover, I'm going to do this without talking about files and directories at all because I think my concern is unrelated to that aspect of the issue (which is unfortunate, since that's the topic of the issue! :) ).
My mental model of Chapel's namespaces and scoping (which I believe matches what is implemented today), goes something like this:
- `use`s are transitive (or `public`) and `private use` has not been implemented yet (Lydia is taking a look at that in this sprint). This may contribute to why it feels like there is a global or single (in the bad sense) namespace.
- `use` of certain modules like `IO` and `Math` combined with the previous bullet also tend to make things far more porous and global-seeming than they ought to (as in issue #13118).
Given that, when I think of a Chapel program, I tend to think of its structure as being formed around the nesting or hierarchy of modules. For example, given the code:
module M1 {
module S1 { ... }
module S2 { ... }
}
module M2 {
module S1 { ... }
module S2 { ... }
}
my mind pictures the following (and apologies, but I'm going to use a directory hierarchy notation for convenience, though I'm not trying to tie this back to directories and files in any way):
/
  M1/
    S1/
    S2/
  M2/
    S1/
    S2/
Moreover, when I think of use statements, I tend to imagine symbolic links (ugh, file system analogies again) that point from the scope where the use occurred to the symbol or symbols that it makes available (or * if it's not filtered at all), permitting them to be referenced as though they were defined within that scope.
OK, so where a lot of this conversation hung me up a month ago is what seemed to me like a recurring theme of "I want to use a module / make it known to the Chapel compiler, but I don't want anyone else to be able to see it, yet I don't want to make it a submodule." And to my thinking, that seems inelegant and counter to Chapel's design.
As a specific example to talk about, let's go all the way back to example 1:
main/
  main-module.chpl # Uses M
M/
  src/
    M.chpl # Uses L
    L.chpl
In my mind, regardless of the arrangement of files and directories here, the options for the module hierarchy on master today are either:
case 1: sub-module
/
  main-module/
  M/
    L/ # L is a sub-module of M
in which case nobody can get to L without going through M (assuming M lets them by making L public)
or:
case 2: sibling module
/
  main-module/
  M/
  L/ # L is a sibling of M
in which case we shouldn't be surprised if others can see L because we haven't done anything to hide it from them.
What I worry about is that it felt like the original post and several of the comments have been wanting something new and different like the following:
case 3: private sibling module
/
  main-module/
  M/
  L-but-private-to-M/ # L is a top-level module but nobody other than M is allowed to know about it
To me, this feels like a new, complicated, and unnecessary concept that I'd like to avoid if at all possible. That is, I believe that if you don't want others to know about L, it should be a sub-module of M; and that if you're not willing to do that, it should be OK with you for others to refer to L when it isn't shadowed, since it's defined at the top-level. Maybe put another way, I'd like to avoid injecting a notion of permissions onto the module hierarchy such that some modules can see certain top-level modules while others cannot.
So my baby step for today is to pause at this point and see whether anyone (but particularly @mppf and @BryantLam) disagree with what I've written here (where you're welcome to point out "Yeah, we'd already come to this same conclusion midway through that huge conversation you skimmed"). Most specifically:
- L a private sub-module of M sufficient?)

My viewpoint today is that I'd be pretty happy with a system emulating some of the Rust proposals (e.g. what I outlined in https://github.com/chapel-lang/chapel/issues/12923#issuecomment-494433624 ) which provides for easy submodules but not for the private-sibling-module pattern. However, I view this as having some of the same features as the original proposal in terms of customizing how the compiler "finds" modules (since e.g. M.chpl can access M/L.chpl but the rest of the source code cannot). But yes, it does so with submodules rather than private-sibling-modules.
I think that it would be reasonable for an author trying to achieve Case 3 from the original issue description to put L.chpl inside of M somehow. I have some concern that if doing so is not really easy / intuitive in terms of files and directories, users won't do it. Additionally, I think the question of whether or not it should be an error for Mason packages is a bit fraught. Perhaps mason should merely print out the names of the modules that are being exported.
I agree with Michael. Admittedly, this issue went off on a tangent for a bit, but it's all related to the original post's issue of conflicting same-named modules in the global module path (#8470). Reusing Example 1 from the original post (modified to include K) and ignoring the proposed solution in https://github.com/chapel-lang/chapel/issues/12923#issuecomment-494433624 that has partially forked into a separate discussion in #10946:
# Filesystem Structure
# Example 1B
main/
  main-module.chpl # Uses M, K
M/
  src/
    M.chpl # Uses L
    L.chpl
K/
  src/
    K.chpl # Uses a completely different L
    L.chpl
The module hierarchy is:
# Module Structure
# case 1: sub-module
/
  main-module/
  M/
    L/ # L is a sub-module of M
  K/
    L/ # This independently developed L is a sub-module of K
Today, this program cannot be compiled without conflicting-module errors.
chpl main-module.chpl -M M/src -M K/src
# error about redefinition of L
How do you solve this issue without local module search paths? One option is to do it using low-level primitives like the include statement that you proposed (https://github.com/chapel-lang/chapel/issues/12923#issuecomment-488859307), but that is both too flexible and duplicative of use statements if we are to take a strategic view and consider what it means to package and distribute Chapel libraries. We have capable language features (use and/or import) and can impose some arbitrary (but well-intentioned) restrictions on the file/directory layout (#10946) with the end goal of still requiring local module search paths, but now we would have fewer paths to actually search through.
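To make the Example 1B conflict concrete, here is a rough Python model of name lookup with one global search path versus per-module local search paths. It is a sketch under stated assumptions, not the compiler's actual algorithm: `LAYOUT`, `find_candidates`, `resolve`, and the `LOCAL_PATHS` table are invented for illustration, and the only inputs taken from the example are the two `src` directories and their files.

```python
# Files from Example 1B, as {search directory: [module files in it]}.
LAYOUT = {
    "M/src": ["M.chpl", "L.chpl"],
    "K/src": ["K.chpl", "L.chpl"],
}

# One path shared by every module (what `-M M/src -M K/src` gives today).
GLOBAL_PATH = ["M/src", "K/src"]

# Hypothetical per-module paths in the spirit of the proposal.
LOCAL_PATHS = {"M": ["M/src"], "K": ["K/src"]}


def find_candidates(module_name, search_path):
    """Return every file on search_path that could define module_name."""
    wanted = module_name + ".chpl"
    return [d + "/" + wanted for d in search_path if wanted in LAYOUT.get(d, [])]


def resolve(module_name, search_path):
    """Return the single defining file, or raise if it is missing or ambiguous."""
    candidates = find_candidates(module_name, search_path)
    if not candidates:
        raise LookupError("cannot find module " + module_name)
    if len(candidates) > 1:
        raise LookupError("redefinition of " + module_name + ": " + ", ".join(candidates))
    return candidates[0]


if __name__ == "__main__":
    print(find_candidates("L", GLOBAL_PATH))  # two files claim to be L
    print(resolve("L", LOCAL_PATHS["M"]))     # M sees only its own L
    print(resolve("L", LOCAL_PATHS["K"]))     # K sees only its own L
```

With the global path both `L.chpl` files are candidates, which corresponds to the redefinition error; with local paths each of M and K resolves `L` to its own file.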
Edit: I do agree with you. I don't think case 3 of private sibling modules is something I'm overly concerned with in the discussion regarding how to lay out code. In libraries, there will be a package-level module (i.e., Mason package) that has to be exposed as the entry module into that library, similar to a main module of an application. It's why these questions are particularly relevant regarding visibility of symbols within a package boundary.
Edit2: Part of the debate which eventually led to #10946 was how to split submodules into other files so the compiler can still find them.
Thanks for (eventually) asking the simple yes/no questions I was looking for with only minimal (5ish?) other unrelated paragraphs. I'll get back to the topic at hand soon.
I've split off the specific concrete proposal I think we might have some agreement on into #13524.
We implemented something along the lines of #13524. Closing this one.
This issue is a proposal for a solution to the duplicate module names in mason packages issue (#8470):
This issue could be solved by introducing a concept of local module search paths, i.e. each module contains its own module search path rather than using a single global module search path for all modules.
Consider the following example:
Example 1
Directory Layout:
We would like a main module in a different directory to be able to use M directly and not L, and we need to somehow provide the compiler with the location of M.chpl.
Compilation of Main Module:
Today, the global module search path looks something like:
Therefore, L is still accessible to the main module.
In this proposal we'd like for the local module search paths to be as follows:
Therefore, only M can access L directly.
Example 2
Suppose the main-module from before now requires a mason package, Pkg@1.0.0:
The local module search paths under this proposal would be as follows:
Subdirectories
What if there are subdirectories? To support this case, we will need new compilation flags that can modify local module search paths.
The proposed compilation flags for modifying module search paths are:
- --include-package <moduleFile> adds a module (<moduleFile>) to the local module search path of the main module.
- --include-subpackage <moduleFile> adds a module (<moduleFile>) to the local module search path of the last module listed in an --include-package or --include-subpackage flag.
- --package-private-M <path> adds a path or module file (<path>) to the local module search path of the last module listed in an --include-package or --include-subpackage flag.
- -M <path> adds a path or module file (<path>) to the local module search path of all modules being compiled, i.e. the global module search path.
Example 3
Directory Layout:
Compilation Command:
The local module search paths under this proposal would be as follows:
Example 4
Directory Layout:
Compilation Command:
The local module search paths under this proposal would be as follows:
Note: The proposed flag names here are placeholders (especially --package-private-M) so feedback is welcome on those.
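The order-dependent behaviour of the four proposed flags can be sketched in Python as follows. This is one literal reading of the flag descriptions above, not an implementation from the Chapel compiler: `build_search_paths`, the default main-module name `"main"`, and all file paths in the usage example are hypothetical, and appending the -M entries after each module's local entries is an assumption about ordering that the proposal does not specify.

```python
import os


def module_name(module_file):
    """Derive a module name from its file, e.g. 'Pkg/src/Pkg.chpl' -> 'Pkg'."""
    return os.path.splitext(os.path.basename(module_file))[0]


def build_search_paths(args, main_module="main"):
    """Map each module to its local search path, given a flat [flag, value, ...] list."""
    local = {main_module: []}
    global_path = []
    last = None  # last module named by --include-package / --include-subpackage

    it = iter(args)
    for flag in it:
        value = next(it, None)
        if value is None:
            raise ValueError("missing value for " + flag)
        if flag == "--include-package":
            local[main_module].append(value)
            last = module_name(value)
            local.setdefault(last, [])
        elif flag == "--include-subpackage":
            if last is None:
                raise ValueError(flag + " must follow --include-package")
            local[last].append(value)
            last = module_name(value)
            local.setdefault(last, [])
        elif flag == "--package-private-M":
            if last is None:
                raise ValueError(flag + " must follow --include-package")
            local[last].append(value)
        elif flag == "-M":
            global_path.append(value)
        else:
            raise ValueError("unknown flag " + flag)

    # Assumption: -M entries are visible to every module, after its local entries.
    return {mod: paths + global_path for mod, paths in local.items()}


if __name__ == "__main__":
    # Hypothetical command line; none of these paths come from the proposal.
    demo = ["--include-package", "Pkg/src/Pkg.chpl",
            "--package-private-M", "Pkg/src/util",
            "--include-subpackage", "Dep/src/Dep.chpl",
            "-M", "lib"]
    for mod, paths in build_search_paths(demo).items():
        print(mod, paths)
```

Note that once --include-subpackage has been seen, later --package-private-M flags attach to the subpackage rather than to its parent, which is the kind of ordering sensitivity raised as a concern earlier in the thread.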