Closed kud1ing closed 10 years ago
CC @bstrie because I think I saw you chime in on the complaints on IRC yesterday
Crates are pretty similar to Haskell's module system. (Admittedly we don't have the tool support to use them like that easily, yet.)
They may be similar, but they are harder to use. Maybe mostly due to aspect "1"?
+1 on this topic from my side. I was also a little bit involved in the IRC discussion yesterday, after I had far more issues dividing my code into multiple files and modules than in any other language I have worked with up to now.
The main thing that bothered me is the binding between files and modules: each new file automatically generates a new module. When you try to divide your code into many files (e.g. one file per object, as often done in C++/C#/others and even enforced in Java), you don't just get files but also modules. With the additional modules you have to reimport everything, and you can no longer access private fields/methods. That access is often required when you have some objects that tightly interact. Imho that leads to either very large source code files, or you begin to make many more things public than they should be.
As a backwards compatible improvement to this I could imagine the following: `part "otherfile.rs"`. In all subfiles you specify to which module they belong by starting with `part of lib.rs` or even explicitly state the module name `part of foomodule;`. I like the first one better, because it has less ambiguity.

The incompatible way is to do it simply like C#, Java, AS3, ...:
Besides that I would think that crate local visibility for types would also help to overcome some of the visibility issues that creating new files->modules involves and would really like to see that feature.
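For readers arriving later: crate-local visibility of roughly this kind exists in later Rust releases as `pub(crate)`. The following is only a minimal sketch of the idea, with made-up names (`shapes`, `geometry`, `Circle`, `diameter`) and inline modules standing in for separate files:

```rust
// Two sibling modules (standing in for two files) that interact tightly.
mod shapes {
    pub struct Circle {
        // Visible anywhere inside this crate, but not exported from it.
        pub(crate) radius: f64,
    }

    impl Circle {
        pub fn new(radius: f64) -> Circle {
            Circle { radius }
        }
    }
}

mod geometry {
    use super::shapes::Circle;

    // Reads the crate-local field directly from a different module.
    pub fn diameter(c: &Circle) -> f64 {
        2.0 * c.radius
    }
}
```

With this, `geometry::diameter(&shapes::Circle::new(2.0))` can read `radius` without the field becoming part of the crate's public interface.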
If you ask me, copy the Haskell module system. It's not great but it is a reasonably working subset of what Rust aims to provide.
Please no. Haskell has many great attributes but the module system is emphatically not one of them.
As far as I'm aware Rust's module system is strictly more expressive than Haskell's. Rust decouples the unit of visibility (a module) from the unit of compilation (a crate), which I think is a smooth move. In Haskell the two concepts are tied together. You can recover a Haskell-like system in Rust by basically just never ever using submodules. Put all your definitions at the top level, and if you want to start a new module, start a new crate as well, and they will be compiled separately. So if that's what you want, you can do it today.
> Please no. Haskell has many great attributes but the module system is emphatically not one of them.

Currently I have the impression that Rust's is even less practical.

> As far as I'm aware Rust's module system is strictly more expressive than Haskell's. Rust decouples the unit of visibility (a module) from the unit of compilation (a crate),

What does this solve? Can you give an example of the day-to-day advantages of this? Currently I only see more code and more complexity := cost.

> Put all your definitions at the top level

I don't think that we want to encourage putting unrelated code in a giant crate file. Your other solution would make Rust projects have twice as many files as a comparable project in many other programming languages.
A couple of points:
- The `mod`/`use` thing is weird. I believe it's just an implementation detail, the parser tries to resolve paths too early. I don't see a reason why that would need to stay that way forever.
- `foo.rs` is just `mod foo { ... }` factored out. `foo/bar.rs` is just `mod foo { mod bar {...} }`. They have no meaning without the crate root. Granted, the docs don't really tell you and it's not obvious without forcing C++-like namespace declarations all over the place, but it is a sane system*. This is also why Rust "files" can have mutually recursive dependencies while you have to manually break them up in Haskell.

*assuming you only declare `mod foo;` once, and `use` it everywhere else. It may or may not be a compiler bug to not check for that.
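To make the "`foo.rs` is just `mod foo { ... }` factored out" point concrete, here is a small sketch that uses inline modules in place of files. The names `even` and `odd` are invented for illustration. The two modules call into each other, which is the mutual recursion mentioned above:

```rust
// Equivalent to having even.rs and odd.rs next to the crate root,
// each declared exactly once with `mod even;` / `mod odd;`.
mod even {
    // Refers to a sibling module that is defined further down.
    pub fn is_even(n: u32) -> bool {
        if n == 0 {
            true
        } else {
            super::odd::is_odd(n - 1)
        }
    }
}

mod odd {
    // And this one refers back to `even`.
    pub fn is_odd(n: u32) -> bool {
        if n == 0 {
            false
        } else {
            super::even::is_even(n - 1)
        }
    }
}
```

Neither module has to come first, and neither needs a forward declaration of the other.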
Re aspect 1 and crates: you import a crate with `extern mod`, and `extern mod` statements are required to go before `use` statements, so 1 doesn't apply (unless I'm misinterpreting it).
Re aspect 1 in general: `fn foo() { bar() } fn bar() {}` is similar: you're calling a function "before" it exists, and I rarely see people complain about that. (Admittedly it's not compulsory to put `foo` before `bar` like that, unlike `use` and items.)
I'm not interested in a complete overhaul of the module system at this point. It's just too late in the game.
What I am interested in is making the current system slightly more restrictive than it currently is, in order to reduce the amount of complexity that a user has to keep in their head.
I propose two restrictions:
1. … `foo` in `foo::foo::foo` your current local namespace is referencing.
2. … `fn foo(){} fn foo(){}` is today (`error: duplicate definition of value`).

If I understand correctly, this behavior would allow us to remove the completely unintuitive use-before-mod rule, as bemoaned by others. And if forbidding shadowing makes globs harder to use, then so be it. Globs are an anti-pattern, and may very well be behind a feature flag for 1.0 anyway.

+1 @bstrie
I'm new to Rust, but I'm having trouble understanding the current module system. The use-before-mod rule is completely unintuitive (at least with the current state of the documentation) to me as a new user. Maybe it will be less confusing in the future.
I also agree that globs should not make it into 1.0; tooling can be built around the compiler to help automatically insert use/extern mod statements.
Coming from C#, the Rust module system did not work like I expected, but since Rust is not a verbose class oriented language like C#, the idea of modules becomes more important. I actually like the way a module is located in a specific file by rigid rules. In C#, one has to rely on the IDE to find the declaration of an object. What makes Rust more complicated, is having to put 'mod' and 'use' at top of each file. This becomes a burden when the hierarchy is still in flux.
My idea is to add a new "header"-like file named "rust.rh" per folder, in which one can declare the 'mod' and 'use' for the source files. All source files are compiled as if "rust.rh" is included directly at the top of the source, but it cannot contain code or declarations of new modules. No explicit declaration is necessary to use "rust.rh". This means getting started is easier: one can simply create a new empty document in an existing project and start typing. The compiler must ignore circular modules pointing to the same file, and to do this it needs to strip "rust.rh" for the file it is processing. For future development the "rust.rh" file can contain instructions to flatten all the files in the folder, making them part of the same module. This will make source easier to manage per folder.
My wish is not to have to type 'mod' and 'use' at all per file.
That won't work, since a folder contains many modules (as many as there are files in it), and each module could be in a completely different position in the module tree, and have different needs for imports.
I agree with others that explicit imports through "use" can get easier in the future, as soon as there is a decent IDE. But if I started porting some code from other languages to Rust today and wanted to preserve at least parts of the visibility restrictions, I would end up with source files of more than 10 kLoC. This is something that really makes me nervous.
+1 to @bstrie from me too. Though I think Rule 1 could be too restrictive, the important thing is that you can always loosen those restrictions in a backwards compatible way later on.
Leaving the shadowing rules like they currently are would block that path in the future.
@Kimundi , while allowing shadowing post-1.0 is theoretically backwards-compatible, it would be impossible to reintroduce use-before-mod in a backwards-compatible fashion. So we'd have to be content with either never allowing shadowing, OR allowing shadowing but not being able to enforce the perceived "order" with use-before-mod, OR breaking compatibility.
I would also like to caution everyone again against proposing drastic, sweeping changes to the module system at this stage. This is a highly complex topic with far-reaching implications. We've been through a lot of iteration to arrive at our current system, and while it could be better it's also not nearly as bad as it used to be (who else remembers `export`? or `.rc` files?). Further iteration upon the current system is what we must strive for.
Remember that there are only three hard problems in programming language design:
> Importing functionality and namespace are two steps: mod and use. This leads to the shadowing rules, that have weird consequences: you need to import a path via use which is not even defined at this point in time, because use needs to come before mod. I find this not only unfamiliar but also counterintuitive.

Yeah, we might be able to relax the ordering restrictions a bit.

> The compiled file becomes a crate. In most languages you can iteratively compose a project of files/modules and use/compile those individually. Since use statements are relative to the crate, the meaning changes depending on which file you currently compile. The consequence is that it makes it more difficult than necessary to first write file a.rs, then b.rs (uses a.rs), and then c.rs (uses b.rs) and make them compile individually. Even in Haskell that is easy. In Rust you either have to go back and forth and adjust the use-statements or only put the mod-statements in a dedicated top-level crate (lib.rs or main.rs) which defies encapsulation.

Crates do not support mutual recursion. Modules do.

> If you ask me, copy the Haskell module system. It's not great but it is a reasonably working subset of what Rust aims to provide.

Totally opposed. Haskell modules do not support mutual recursion, which is a tremendous burden.
If you ask me, copy the Haskell module system.
I would be very concerned about limiting one's horizons to Haskell only. Modules are an incredibly rich area of computer science. The more I learn about how powerful the module systems are in SML and OCaml, the more I'm aware of how little I know about the language construct. From the Reddit:
when I think of a simple, powerful, and/or elegant module system, Haskell is certainly not what comes to mind. I think something more like SML's module system + SML/NJs compilation manager is a better ideal, although I like not having to write separate signature/interface files for Rust, and nesting modules in separate files is not really easily done in SML.
– @FluffySauce
I would also add, that it might be important to think about how modules play with traits a little more. SML and OCaml have amazing module systems, but don't have type classes, Haskell has poor modules, but great type classes. What happens when you put them both in the same language? Are there any precedents for that?
Here is a good page describing a bit about OCaml's modules: https://realworldocaml.org/v1/en/html/first-class-modules.html
I don't know if Rust needs that kind of power due to the fact we have type classes. Or they could be two sides of the same coin. I don't know. Just opening up the conversation.
Maybe someone should first clarify what a module is. In the Ocaml book it looks more like Classes in C++ with some encapsulated definitions to me than something like packages or namespaces, but I have no clue about ML.
But maybe we have different expectations, as I'm coming from an OO world. However as Rust wants to compete with C++ I think it would certainly be helpful if the module system is not too far off from what most developers know and expect.
Couldn't we replace the ordering rule for `use` statements with just a lint which warns if imported names are shadowed by local definitions which precede them? (Or optionally also ones which come after them, perhaps as a separate lint.)
I guess what I'm getting at is: where do types, traits, and type parameters fit in with paths and use statements? The associated item syntax `foo::Trait::<for T>::bar()` suggests that trait implementations form their own modules that are instances of a common interface. I've talked about the associated item syntax before, so just to be clear, I'm very much talking about the semantics here.
Sorry if I am derailing the original intent of the original post – I guess part of this is more a response to the title 'Improve the module system'. Syntax is sometimes informed by the underlying semantics, so if something's weird there, then maybe it is suggesting that we have some deeper, unresolved design questions to consider. @pcwalton and @nikomatsakis may already have ideas on this though.
Some interesting papers:
Maybe someone should first clarify what a module is. In the Ocaml book it looks more like Classes in C++ with some encapsulated definitions to me than something like packages or namespaces, but I have no clue about ML.
Yes, that is an issue. Perhaps the OP is indeed thinking of modules as more of 'namespaces' which are more for organizing large sets of items into manageable, self-documenting chunks, rather than the powerful language features displayed in languages like SML that are designed to help enable the modularization of code. We need some clarification on terminology here otherwise we will be talking past each other.
If I recall correctly @nikomatsakis and I have talked about type parameterisation of modules. It could allow for some powerful patterns, but I think there were issues with it making things like resolve much more complex, and it might also overlap some of the functionality of traits (see the papers I posted above).
@bstrie I know, that's why I didn't talk about allowing arbitrary order of imports and module definitions here. :)
Forbidding both import and module shadowing and arbitrary ordering would be a backward compatible ruleset to tweak in the future.
As I understand it, crates are developed top-down, while many other languages support bottom-up development.
What I mean by this is that in other languages it is not unusual to first develop types and functions/methods in independent files/modules and then bind them together in a library or executable, or not at all (object file). In Rust OTOH, you always start with a crate, where you are automatically faced with the decision "library or executable?" upfront. Only after that do you fill the crate with types and functions/methods.
If this view is true, the confusion could be solved by documentation.
This is some of the latest module related research that I know of:
"ML modules provide hierarchical namespace management, as well as fine-grained control over the propagation of type information, but they do not allow modules to be broken up into mutually recursive, separately compilable components. Mixin modules facilitate recursive linking of separately compiled components, but they are not hierarchically composable and typically do not support type abstraction. We synthesize the complementary advantages of these two mechanisms in a novel module system design we call MixML. " http://www.mpi-sws.org/~rossberg/mixml/
and Haskell's take on MixML:
" Module systems like that of Haskell permit only a weak form of modularity in which module implementations directly depend on other implementations and must be processed in dependency order. Module systems like that of ML, on the other hand, permit a stronger form of modularity in which explicit interfaces express assumptions about dependencies, and each module can be typechecked and reasoned about independently.
In this paper, we present Backpack, a new language for building separately-typecheckable packages on top of a weak module system like Haskell's. The design of Backpack is inspired by the MixML module calculus of Rossberg and Dreyer, but differs significantly in detail. Like MixML, Backpack supports explicit interfaces and recursive linking. Unlike MixML, Backpack supports a more flexible applicative semantics of instantiation. Moreover, its design is motivated less by foundational concerns and more by the practical concern of integration into Haskell, which has led us to advocate simplicity—in both the syntax and semantics of Backpack—over raw expressive power. " http://plv.mpi-sws.org/backpack/
This still seems to be of research status but might give inspiration. At least it is the state of the art AFAIK.
I don't care one way or the other, but I am very much against any changes that breaks mutual recursion of modules. Thanks.
Also related is the Backpack extension to Haskell's module system:
Is replacing or extending our module system with one whose scope is not just namespacing but also abstraction (like ML and Backpack) seriously under consideration?
I have said this several times:
When I write `use extra::x::y::z`, the compiler always knows I'm using crate `extra`, so why do I still have to write `extern mod extra`? It should be optional, unless the compiler runs into an ambiguity.
In general, it doesn't know that you are using a crate called `extra`. What if you have a module called `extra` in some crate `foo`? Then

    use foo::extra;
    fn main() {
        extra::baz(); // do I look in the `extra` crate or in `foo::extra`?
    }

In any case, if we did this for all crates in some central registry, it would lead to programs accidentally compiling, e.g. you want to type `foo::bar()` but accidentally write `fooo::bar()` => the compiler searches for the `fooo` crate, installs it and calls the `bar` function from it (imagine that `fooo` and `bar` both exist)... when it should definitely just be a compile error.
@huonw

    use foo::extra;
    fn main() {
        extra::baz(); // You didn't write `use extra::xxx`, so there is no need to look in libextra. Maybe you misunderstood something.
    }

When `foo::bar()` is misspelled as `foooo::bar()`, it's obviously a compile error, because you only wrote `use foo::*`, not `use foooo::*`, before this line.
And what I meant by "optional" is: it's not required in the common case, but it is required in the situations where rustc runs into an ambiguity.
On Sat, Jan 25, 2014 at 01:35:43AM -0800, Gábor Lehel wrote:
Is replacing or extending our module system with one whose scope is not just namespacing but also abstraction (like ML and Backpack) seriously under consideration?
Not in the short term, I don't think.
Is replacing or extending our module system with one whose scope is not just namespacing but also abstraction (like ML and Backpack) seriously under consideration?
I'm not really advocating this right now – there is already enough to do for 1.0, but I do think it would be prudent to consider this so that we don't back ourselves into a corner in the future.
my 2c.... to me the rust module system is surprisingly counter-intuitive, having not used anything similar;
But could you simply educate users through more elaborate error messages: when you try to reference something that doesn't exist, the compiler could look in places that correspond to common errors users make .. and suggest corrective action ("did you mean ::foo::bar()" or "add 'use foo::bar' to make bar visible")
I would have guessed something more like Haskell's would be easier to get into, but I gather there are good reasons not to do it that way. ("use mod ...." "use mod .... as
I usually end up making a "common.rs" with a load of use's and just use common::* all over the place; as mentioned above it makes life easier when a system is in flux. Seems like one can avoid making a separate source file for that by saying "use super::*", not sure if that's a good or bad idea.
At the minute I'm basically shying away from heavily using the module system because of the lack of an IDE; I still try to make symbol names unambiguous across my (so far, small) projects. With polymorphism going on you shouldn't need so many symbols. Having said that, in C++ I use nested classes a lot, and even C++ people say "avoid that, make a namespace.." ... so I am beginning to see the rationale behind the Rust method.
I started writing multi-file Rust today, and I think the way it currently works is completely counterintuitive to newcomers. Documentation is no substitute for sane defaults.
If something doesn't change, I believe you'll end up with include!() as the preferred way of managing modules.
- `mod foo;` goes in every file that wants to access the stuff in `foo.rs`
- `mod deeper::bar;` goes in every file that wants to access the stuff in `deeper/foo.rs`
- `use foo;` is meaningless (but to avoid breaking old code, would continue with the current meaning)
- `use foo::frob` to avoid typing `foo::frob();` (this is what people expect of `use`, i.e. something that may be quite rare depending on your programming style)

mod fish;
mod penguin;
mod penguin;
mod penguin::bar;
mod foo;
mod bar;
use foo::popular_function;
mod bar;
Now step back and look at that source layout again. I bet you understood exactly how my proposed modules work without having to read any documentation.
… `main.rs`, it is `::`; for `penguin/mod.rs` it is `::penguin`.

… `mod foo;`, it:
- … `foo.rs` or `foo/mod.rs` and add a reference to it in the current directory root module
- … `foo.rs` to "list of files to read sometime after I'm done with this one"
- … `use foo;` referring to the module added.

… `use`s will put the mods in the same place
… `use rootmodule::main;` (Python has problems with this)

- … `penguin.rs` or `penguin/mod.rs` to exist? Instead just `mod penguin::foo;` could imply it?
- … `penguin/qux.rs` exists but is not mentioned in `penguin/mod.rs`, should it be possible to refer directly to it from outside of `penguin/`? Currently it's not, but if the import is pub, it could be `::penguin::foo::qux`
- … `mod foo { }`) be feature-gated?
- Should there be a syntax for relative imports from a higher package (like python `from .. import foo`)? (Only really useful with deep package hierarchies)

Here is one proposal:
At the moment all imports come from the root of the crate by default in order to remove ambiguities when resolving paths. How about making working relative to the current module the default?
Say we have a crate set up like:
    mod baz {
        mod foo {
            fn bar() {}
        }
    }

Let's say we are working within the `baz` module:
- `use foo::bar;` and `foo::bar()` would access the item relative to the current module.
- `use ::baz::foo::bar` and `::baz::foo::bar()` would access an item relative to the root module of the crate.
- `use super::baz::foo::bar;` and `super::baz::foo::bar()` would access an item from a higher module relative to the current module.

This might make things less surprising – I still get tripped up by having to use `self::` for working relative to the current module. The disadvantage of this is that it might make 'the common case' more ugly. So I'm not sure. Food for thought anyway.
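For comparison, a sketch of how module-relative access can be spelled out explicitly with a recent compiler. The function names (`relative_use`, `relative_expr`, `through_parent`) are illustrative only, and the `use` is written with an explicit `self::` on the assumption that this is the most portable spelling across compiler editions:

```rust
mod baz {
    pub mod foo {
        pub fn bar() -> &'static str {
            "bar"
        }
    }

    // `self::` makes the `use` path relative to the current module (`baz`).
    use self::foo::bar;

    pub fn relative_use() -> &'static str {
        bar()
    }

    pub fn relative_expr() -> &'static str {
        // Paths inside expressions are resolved relative to `baz`.
        foo::bar()
    }

    pub fn through_parent() -> &'static str {
        // `super::` climbs to the enclosing module, then descends again.
        super::baz::foo::bar()
    }
}
```

All three functions reach the same `bar`; the difference is only where the path starts.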
Woops, I pressed "Close and comment" by accident! Sorry!
@o11c - i'm the same in that every time i pick it up after a break, what it does is completely surprising; But I think it can be fixed with smaller changes, less disruptive to rust progress..
[1] Educating in error messages:-
[1.1] When you expected to be able to reference a symbol and its in the wrong place, the compiler looks for the closest match, and tells you what the correct path was. You just need to be reminded of the fact 'use' is crate relative, and code is mod relative. This automatic search would also be hugely useful in this early time without an IDE.
[1.2] When you started out saying "mod foo.." where you want foo like I always did, the compiler can warn you.. "warning, multiple copies of mod foo, prefer to bring modules in the crate root and 'use' them elsewhere".
[3] fixing the glob import bugs (currently `use super::*` doesn't work). I've been adding a 'common.rs', and if I could just opt to carry common 'pub uses' declared in the crate root, that would streamline it. (Adding the ability to "use ::*" would make that easier too?) That would eliminate the need for 'common.rs'.
I'm happy with the logic of how the system actually works ... its a combination of namespacing and modules... i've not used that much in C++ voluntarily because you've got the alternative of classes-in-classes, and its an extra layer. The Rust way gives more meaning to the directory tree.
There's one more change I would suggest, I suspect it won't be popular. make pub the default. Every file needs public elements, but private is the optional one. Rust has good namespacing to hide symbols - so to me it makes sense to make everything 'pub' and then you hide with 'priv' where you need more control..(IMO)
> There's one more change I would suggest, I suspect it won't be popular. make pub the default. Every file needs 'pub' . You have good namespacing to hide symbols - so to me it makes sense to make everything 'pub' and then you hide with 'priv' for more control..

I really think we should either have `pub` or `priv` but not both.
I suggested making `pub` the default (and removing it) many months ago, but it was shot down because apparently it leads to too many things becoming public by accident. I agree that the next best thing would be to make everything private by default and remove `priv` (#8122), but it's not totally obvious (yet) how to do that while keeping things flexible, consistent, and not forcing you to write `pub` ten times per declaration.
@bjz The relative paths for `use` thing is #10910, or at least was discussed there.
@o11c, I'm all for simplifications of the module system, but I don't actually think it's that complicated: `mod` defines the module structure, `use` brings names into scope.
> I believe you'll end up with include!() as the preferred way of managing modules.

There is now a significant amount of Rust code in the wild, including large multifile libraries (e.g. most of the libs in this repo, many of the projects here) and there are not many uses of `include!` at all. (And of those 35 (at time of writing), most are doubled/tripled up and/or work-arounds for some macro issues (which are now resolved).)
Of your open questions, these really need to be answered:
- Anything dealing with multiple crates.
- Should there be a syntax for relative imports from a higher package (like python from .. import foo)? (Only really useful with deep package hierarchies)
(The answer to the second one is yes: the stdlib reuses things from other non-sibling modules a lot, e.g. random example of `std::rt::logging` uses a variety of `std::...` modules (pretty much every other module in `std` does a similar thing). Note that most of these are from the root of the crate, not relative imports.)
The biggest pain IMO is splitting a module across multiple files. Right now you need reexports or `include!`, the first being verbose and error-prone (without glob reexports) and the second being a hack, or you just deal with introducing unnecessarily deep/wide hierarchies. Phrased another way, we currently have a way to write mods independent of files, `mod foo { contents }`, but not a way to write files independent of mods.
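A sketch of the reexport workaround described here, with hypothetical names throughout (`shapes`, `square`, `rect`, `square_area`, `rect_area`). The inner modules stand in for `shapes/square.rs` and `shapes/rect.rs`, and the outer module hides the split from callers:

```rust
// `shapes` is the only module callers are meant to see.
mod shapes {
    // Stand-in for shapes/square.rs
    mod square {
        pub fn square_area(side: u32) -> u32 {
            side * side
        }
    }

    // Stand-in for shapes/rect.rs
    mod rect {
        pub fn rect_area(width: u32, height: u32) -> u32 {
            width * height
        }
    }

    // One reexport per item: this is the verbose part, and every new
    // item added to a file needs a matching line here.
    pub use self::rect::rect_area;
    pub use self::square::square_area;
}
```

Callers write `shapes::square_area(3)` and never mention `shapes::square`, at the cost of keeping the reexport list in sync by hand.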
Another minor pain is that I frequently find myself saying `use std; use gl; use glfw;` etc etc etc in submodules. It'd be nice if there was some sort of use that got "inherited" by submodules, so that, for example, `inherit use gl;` makes `gl` available to all submodules. This would also make the prelude a bit less magical; it could simply be `inherit use std::prelude::*;` in the crate root, rather than being injected into every module. I don't think this syntax is very good, but I think the semantics are very nice. (I think it might even be able to be a syntax extension, albeit one that transforms the entire AST starting at the module it is invoked in. Assuming they have that information.)
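Note that `inherit use` is hypothetical syntax from this comment, not something Rust provides. The nearest approximation with ordinary glob imports is the "common.rs" pattern mentioned earlier in the thread. A minimal sketch, assuming made-up module names (`common`, `words`) and a made-up function `count`:

```rust
// Stand-in for a `common.rs` file: one place that re-exports shared names.
mod common {
    pub use std::collections::HashMap;
}

mod words {
    // One glob import instead of repeating every `use` in each submodule.
    use super::common::*;

    // Counts how often each whitespace-separated word occurs.
    pub fn count(text: &str) -> HashMap<String, usize> {
        let mut counts = HashMap::new();
        for word in text.split_whitespace() {
            *counts.entry(word.to_string()).or_insert(0) += 1;
        }
        counts
    }
}
```

Unlike the proposed inheritance, each submodule still has to opt in with its own `use super::common::*;` line.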
I'd just like to point out that @o11c's module system (at least the "Source Layout" section) is pretty much how modules work in D. That system works well and it's pretty intuitive. The filesystem is always going to be there and we'll be putting code inside files inside folders forever; might as well have that hierarchy be reflected as modules/packages and reduce confusion.
@cmr I like the "inherit" idea; it does a better job than the "common.rs / use common::* everywhere" hack. To avoid a new keyword, how about making

    pub use ... ;   // as it is now
    use ....;       // inherited by submodules - same as proposed 'inherit', but it's the default.
    priv use ....;  // only visible to this module, not submodules, replaces the current default

Fewer keywords for the newcomer to throw in, and more control by adding more keywords.
> use ....; // inherited by submodules - same as proposed 'inherit', but it's the default.

...another thing where "I thought it already worked that way" (and not just for `use`). Probably because of intuitions carried over from C++: I was thinking nested `mod`s in Rust are like nested classes in C++, and in terms of privacy I think they are, but apparently not in terms of what things are in scope.
(And I know I've relied on this mistaken assumption in at least one other comment somewhere, and no one called me on it. Please do, if you notice!)
It'd be sad to use `priv` for this since we're so close to having it be gone entirely. I'm also not sure I want it being the default... I really like that I can tell what's in scope by looking at the top of the mod, for the most part. Maybe it's not that bad though.
cc @flaper87
(To be clear I wasn't implying the C++ behavior is better.)
@FlaPer87 you know there's a `Subscribe` button at the top right? :)
@glaebhoerl yes, but 1) I'm subscribed to all rust notifications (yes, I go through pretty much all of them) and 2) the cc helps with my email filters, tags etc ;)
I am not sure what the chances are, but if it could be improved, it should be before 1.0. Rust's module system has a steep learning/doing curve. I think there are two reasons for this:
1. Importing functionality and namespace are two steps: `mod` and `use`. This leads to the shadowing rules, that have weird consequences: you need to import a path via `use` which is not even defined at this point in time, because `use` needs to come before `mod`. I find this not only unfamiliar but also counterintuitive.
2. The compiled file becomes a crate. In most languages you can iteratively compose a project of files/modules and use/compile those individually. Since `use` statements are relative to the crate, the meaning changes depending on which file you currently compile. The consequence is that it makes it more difficult than necessary to first write file `a.rs`, then `b.rs` (uses `a.rs`), and then `c.rs` (uses `b.rs`) and make them compile individually. Even in Haskell that is easy. In Rust you either have to go back and forth and adjust the `use`-statements or only put the `mod`-statements in a dedicated top-level crate (`lib.rs` or `main.rs`) which defies encapsulation.

This could be an issue that is fixable with more/better documentation. However, when you need a lot of words/actions to describe/do something in one system which does not need that care in another system, that could be an indication that there is some accidental complexity. I think this complexity is accidental since I am not yet aware of which gains are compensating for this.