Tracking issue for RFC 2126: Clarify and streamline paths and visibility

aturon commented 7 years ago

This is a tracking issue for the RFC "Clarify and streamline paths and visibility" (rust-lang/rfcs#2126).

Steps:

[ ] Implement the RFC. The various implementation steps are broken out into subissues (or will be, in some cases).
- [x] permit foo.rs or foo/mod.rs to support submodules like foo/bar.rs #45385
  - (mentoring instructions here)
- [x] introduce crate as a visibility modifier #45388
  - (mentoring instructions here)
- [x] allow crate to begin an absolute path https://github.com/rust-lang/rust/issues/45477
- [x] add a lint for pub items that are not visible from crate root https://github.com/rust-lang/rust/issues/45521
[ ] Some implementation issues are blocked on permitting crate to begin an absolute path. Neither of these items would take effect unless we are using that extension:
- [x] add a lint for absolute paths that do not begin with crate (https://github.com/rust-lang/rust/issues/48722)
- [ ] allow eliding extern crate when feature gate is given
  - Rationale for delaying: if we require all local paths to begin with crate, we don't need to worry about fallback
  - This feature also depends on Allowing renaming of crates in the dependencies section
  - [x] Add epoch lint for this https://github.com/rust-lang/rust/issues/48719
[ ] Adjust documentation (see instructions on forge)
[ ] Stabilization PR (see instructions on forge)

Unresolved questions:

How should we approach migration? Via a fallback, as proposed, or via epochs? It is probably best to make this determination with more experience, e.g. after we have a rustfix tool in hand.
The final syntax for absolute paths; there's more bikeshedding to be done here in a context where we can actually try out the various options. In particular, there are some real advantages to having both crate:: and extern:: paths, but ideally we could do it in a more succinct way.
- Currently, we accept ::crate::foo only. However, we also parse struct Foo(crate ::foo) as an illegal path, rather than treating crate as a visibility modifier. @petrochenkov describes some of the reasoning in this comment.

retep998 commented 7 years ago

That RFC has 4 distinct features and yet we still only have one tracking issue for all of them. Can we please not shove together disparate features like this?

aturon commented 7 years ago

@retep998 As explained throughout the RFC discussions, these features are connected through global design considerations. E.g., providing an external mechanism for renaming crates is motivated in part by the eventual deprecation of extern crate. We can and will gate various aspects separately (and will likely have some overlapping gates for trying out different syntaxes), but for discussion on the overall design and stabilization, it's important to keep global coherence in mind.

retep998 commented 7 years ago

Having an external mechanism to rename crates is something which has already been desired and needed for years (https://github.com/rust-lang/cargo/issues/1311) and could have stood alone just fine, but instead it's being used as nothing more than a pawn to support killing off extern crate.

We've had no problems with having separate RFCs for closely related features in the past (the RFCs for repr(align(N)) and repr(packed(N)) come to mind), yet now we're claiming that changing foo/mod.rs to foo.rs is so closely related to extern:: and crate:: that they have to be in the same RFC and use the same tracking issue?

rpjohnst commented 7 years ago

Just to make sure this point sticks around, since it's a relatively subtle aspect of the syntax unresolved question: Using crate as a visibility modifier as well as a path prefix introduces a parsing ambiguity between crate ::absolute::path and crate::relative::path. Using a contextual keyword introduces the same ambiguity, and there are no other reserved keywords that really make sense as visibility modifiers.

I would thus like to (at least) experiment with leaving out the crate visibility modifier and sticking with the existing pub(crate).

rpjohnst commented 7 years ago

I wouldn't mind splitting the foo.rs/foo/mod.rs point into a separate tracking issue, since it truly does seem independent of the path and visibility modifier changes.

As far as external crate renaming... there already is a separate issue for it? It is pretty important for the path changes so I think it's fine being part of this one as well.

neon64 commented 7 years ago

Despite lots of discussion a few weeks ago regarding the choice of crate as a visibility modifier (not a path prefix), I'm disappointed to see that although this choice of keyword was listed as an 'unresolved question' in the RFC, it has now been apparently forgotten. I myself, and several others I've noted, find this choice confusing since it isn't an adjective/modifier, and can also be ambiguous: does crate mean part of the crate's public API? No! It means local or internal or pub(lished) to the crate (hence why I prefer the latter keywords). So I'm not demanding immediate change, but at least acknowledge it as an unresolved question in this tracking issue, so that it is not forgotten upon stabilisation.

It's great that this module redesign has progressed so far, but at the same time it's important that we don't blunder ahead just to make the 'impl period', and end up making decisions without consulting the majority of Rust users. And, unfortunately, I think the people involved in Github RFC discussions aren't representative of the whole user base, because of the sheer flood of information/comments/opinions there can be discouraging. So somehow that needs to be dealt with too.

lilianmoraru commented 7 years ago

It isn't clear how the implementation ended-up...

vitiral commented 7 years ago

@rpjohnst @retep998 I have opened a new RFC to discuss foo.rs + foo/ as well as propose a few improvements to this RFC.

Edit: I suggest we open another RFC to discuss crate as a visibility modifier. Personally I would like to see the opposite: pub(extern) be added and required for all externally published symbols. This would make bare pub be the equivalent of pub(crate) in the next epoch.

est31 commented 7 years ago

@rpjohnst

introduces a parsing ambiguity between crate ::absolute::path and crate::relative::path

Isn't crate ::relative::path invalid code anyway? Can't it be rejected during parsing?

rpjohnst commented 7 years ago

@est31 No, that requires doing name resolution during parsing which is definitely not something we want to do. It's one of the more annoying parts of parsing C and C++, which have similar ambiguities around a * b being a multiplication or a declaration, and around a b(c, d); being a variable declaration or a function prototype.

If we ever start allowing dependencies and top-level modules with the same name, even tangling up name resolution with parsing won't save us- crate :: some :: item could be a crate-visible item from dependency some, or a private item from top-level module some.

To keep this in perspective, we could just arbitrarily resolve the ambiguity one way or the other and require parentheses to write the other case (probably the crate-visibility absolute path case, which seems rarest), but that's still a foot-gun that we don't need if we stick with pub(crate), which already solved the syntactic ambiguity.

petrochenkov commented 7 years ago

Using crate as a visibility modifier as well as a path prefix introduces a parsing ambiguity between crate ::absolute::path and crate::relative::path.

This is a tiny issue, IMO, given that both visibilities on tuple struct fields and "inline" absolute paths are rare. Paths are always parsed greedily currently, so it makes sense for crate :: x :: y to mean crate::x::y. If the opposite meaning is wanted, then pub(crate) ::x::y or crate (::x::y) can be used.

est31 commented 7 years ago

@rpjohnst @petrochenkov could you please share a code example where pub ::relative::path is valid code? I don't get the entire ambiguity thing because I think one of the two cases is invalid.

Edit: the only place where this could be relevant is inside macros when you match for a visibility qualifier + a path. But that's a very small breakage so acceptable IMO.

rpjohnst commented 7 years ago

@est31 You seem to have the wrong idea- it's not about relative paths, they're never an issue. It's about building the AST before you know what any of the names refer to. Here's a full sample:

struct S(crate :: x :: y);

Given that you're not allowed to look up any of those names yet, how do you convert that string of characters into an AST? There are two possible answers. One has a private field of a type y defined in module x of the current crate. The other has a crate-visible field of a different type y defined at the top level of dependency x. No macros needed.

est31 commented 7 years ago

@rpjohnst I see thanks for clarifying. Its indeed an ambiguity.

CasualX commented 7 years ago

@rpjohnst

A simple solution is to define it unambiguously parse to the crate visibility modifier. If you want to parse it as a tuple struct with private member of the given type, do it like this:

    struct S(:: crate :: x :: y);

(note the :: prefix to indicate the root 'namespace')

This is consistent with how other root namespaces need to be referred to in submodules (eg ::std::x::y).

rpjohnst commented 7 years ago

Just disambiguating to crate-as-a-visibility would be somewhat surprising, I think. But it might be a good idea to "canonicalize" that workaround a bit and enforce that crate:: is always used with a leading :: (outside of uses of course). As you point out, this increases symmetry with ::std::x::y-like paths, and leaves the un-prefixed form for truly relative paths (self::/super::/something_in_scope::).

Should we do this? Allowing crate::x::y in non-use paths makes crate kind of magical, while enforcing ::crate::x::y scopes it to the same level as dependencies, which is what we're trying to imply anyway.

petrochenkov commented 7 years ago

@rpjohnst

enforce that crate:: is always used with a leading :: (outside of uses of course)

This may be reasonable, at least for a start (nothing prevents relaxing this later).

Zoxc commented 7 years ago

I'm not a fan of the use extern::bar::foo or use crate::bar::foo syntax. It seems very noisy. I'd prefer some sugar for this. I suggest extern bar::foo for this.

zengsai commented 7 years ago

How about add one more implicit rule? Automatically import extern crate into root namespace, make a more coincident path expression.( I'm not good at English, pls learn my view from the following example)

e.g.

Our project structure like this:

src |--lib.rs |--foo.rs |--foo |----|--bar.rs

you config a dependence in Cargo.toml, like [dependencies] serde = "3.0.0"
we add mod bar into mod foo, and add mod foo into lib.rs,
we define a function finlib in lib.rs, define a function finfoo in foo.rs, define a function finbar in bar.rs

Make the extern crate's scope as the top level module in crate, then we can write code like this anywhere.

for fully/qualified path

::serde::Deserialize
::serde::x::y
::finlib                            // not ::crate::finlib
::foo::finfoo                   // not ::crate::foo::finfoo
::foo::bar::finbar           // not ::crate::foo::bar::finbar

relative path write like this

serde::Deserialize      // no need to write `use serde`
serde::x::y                   // no need to write `use serde`
finlib                          
foo::finfoo                  
bar::finbar

we first look up serde\finlib\foo in current mod scope, if not found lookup in supper mod, until the root namespace. if there have name conflict, we write fully path instead.

also we can use self::bar::finbar in foo.rs to avoid name look up.

rpjohnst commented 7 years ago

I'm not able to find the earlier thread that discussed this, but the compiler used to work that way and it caused major problems for name resolution. IIRC @pcwalton or @arielb1 probably know more.

neon64 commented 7 years ago

Wouldn't the easiest solution to this ambiguity be to use a different keyword for the crate-local visibility modifier (ie: thepub(crate) replacement). For example, local, as suggested many times before in previous RFC discussions. Then struct S(local :: x :: y) is distinct from struct S(crate :: x :: y).

rpjohnst commented 7 years ago

That just introduces another ambiguity between <visibility> <absolute path> and <relative path starting with the new keyword>, since the new keyword would have to be contextual.

neon64 commented 7 years ago

@rpjohnst ah damn.. But isn't that a problem the epochs were designed to solve? Eg: in the current epoch we just use pub(crate), and then, in the next epoch, introduce a new non-contextual keyword that is more 'ergonomic'.

est31 commented 7 years ago

@neon64 yes but its a claim by the RFC that it doesn't require a new epoch. That claim seems to not hold.

rpjohnst commented 7 years ago

It holds just fine- there is currently no code that uses the crate visibility or crate:: paths, so only new code will be affected. As long as we pick a resolution and stick with it there's no compatibility problems.

glaebhoerl commented 7 years ago

One thing that's occurred to me, with apologies if it's been brought up before in one of the discussions (I don't recall having seen it in the last couple):

In the latest proposal, we have crate:: to refer to "in this crate" and self:: to refer to "in this module". This is slightly inconsistent and potentially less clear than it could be: for one thing, where self refers to "this", it's not inherently obvious "this what", and conversely, from the fact that there is a self:: which means "this module", one might infer that crate:: could mean "some other crate", which is a confusion that has in fact been mentioned in the thread.

One possible solution would be to phase out self:: in favor of mod::. Then we'd have crate:: to mean "in the nearest enclosing crate", and mod:: to mean "in the nearest enclosing module", and it would be clear and consistent. We could also potentially solve the issue where it's not possible to refer to items within an fn scope in a qualified way at all by introducing an fn:: prefix, to unsurprisingly mean "in the nearest enclosing fn". (I haven't thought about whether it might make sense to go further, and also have things like trait:: or impl::.)

nikomatsakis commented 7 years ago

@glaebhoerl that's an interesting proposal. Personally, I think I have come around to the idea of introducing a new syntax for absolute paths and deprecating the existing one. I'm not sure what this syntax should look like, but I will describe one possibility below (that I know has been floated before).

Imagine we had the following grammar for paths:

Path = AbsolutePath | RelativePath
AbsolutePath = 
    | `@` ID? 
    | `@` ID? `::` RelativePath
    | `self` :: RelativePath
    | `super` :: RelativePath
RelativePath = ID (`::` ID)*

Under this grammar, one would reference things from other crates by beginning with @crate, for example:

use @std::collections::HashMap;

One would reference the local crate with just @, for example:

use @::something::in::my::crate;

(And use self::something remains same as today, for better or worse.)

Something nice about this is that absolute paths are completely distinguishing from relative paths, which means we now have the desirable property that one can copy-and-paste a path from a use into code and it works:

fn foo() {
    @std::cmp::min(a, b) // OK
}

This is not true today, which definitely confuses me from time to time.

We also eliminate the parsing ambiguity around crate, so you can do pub Foo(crate @::MyType) or whatever and it works fine.

(In general, having relative and absolute paths start with the same :: has been a source of pain many times, is the truth.)

One thing I don't know is whether @foo is the best. The other option that I think works and which I've thought of is []:

[std]::collections and [crate]::collections // []::collections seems too weird.

rpjohnst commented 7 years ago

I really don't think we ought to introduce any new sigils just for paths. Enforcing a leading :: in ::crate::foo is enough to address @glaebhoerl's comment and to eliminate the parsing ambiguity. It also fits the intended mental model for crate:: more closely- it's a stand-in for the name of the current crate, not a path prefix like self::.

(In general, having relative and absolute paths start with the same :: has been a source of pain many times, is the truth.)

I'm not sure what you mean by this. I don't think any relative paths start with ::, do they?

est31 commented 7 years ago

@nikomatsakis @glaebhoerl independent of whether I like your suggestions or not (I actually can live with both), can we please stick to the RFC? It has been discussed back and forth in plenty ways and I don't think that opening up debate again would help anybody. A total of 82 people have commented on the thread of the latest RFC (there were quite some discussions before), many feeling very strongly about it. I think it would be unfair to them to sneak in a last-minute change of the proposal at this point, without giving them a chance of reviewing the change. Especially I'm a bit confused because in #44721 @nikomatsakis shut down discussion about the feature with the argument "let's only talk about the implementation please".

The @ notation and [] notation have both been proposed (with me being a supporter of both) but they eventually didn't make it, at least that's my impression, due to negative responses from users.

Ichoran commented 7 years ago

Even if there are no cases at all where a leading :: can be confused with a path that starts on a preceding identifier and has an internal ::, it isn't very distinct visually.

Also, it's less suggestive. @ is a better mnemonic. With a leading :: you have to wonder whether it's like filename paths, where leading / means root (if you're on a Unixy system), or if it's a standard abbreviation, where no leading thing would mean "we left the name out because it's relative". With a moment's thought you can see that the former makes more sense, but the point is that it takes a moment's thought. @ (or somesuch) is instantly distinguishable to both the compiler and the coder, making it easier to instantly grasp what is going on.

rpjohnst commented 7 years ago

How is a leading @ easier to distinguish than a leading ::? :: is already an established mechanism both in Rust and C++!

Ichoran commented 7 years ago

@rpjohnst - I said how in the very sentence you are responding to! @ is distinct visually. To expand: it jumps out at you. There is no danger of leading vs. internal positioning being confused in your mental model of the token stream.

rpjohnst commented 7 years ago

@ is a completely new token with no meaning whatsoever, and would take far more than "a moment's thought" to figure out. Confusing a leading :: for an internal :: isn't a problem to begin with- there is no reason anyone would have to take "a moment's thought" to dismiss the idea that a leading :: might be a relative path.

Ichoran commented 7 years ago

@rpjohnst - Your statements are true given sufficient training on what :: means (which is provided by experience with C++ or Rust). I was speaking of learnability and intrinsic difference. "The same symbol used in a different context" is never going to be as distinguishable as "a unique symbol".

I could accept an argument that it's not worth burning @ as a unique symbol on a use case like this where there is a tolerable alternative.

sanmai-NL commented 7 years ago

@Ichoran: I think that adding new token for a certain semantics (@) to the Rust grammar is not a step to be taken lightly, and I have a slight suspicion we are too close to doing that with @nikomatsakis and others' proposal. For one, because we then cannot reuse it for another purpose later in the life of the language (I suppose paths are so pervasive in the code that the @ token could appear in many places). I also wonder what the effect is on the perception of the complexity of Rust code. To a novice not used to e.g. C++, I believe (but don't have research on this) Rust is language with a quite baroque, intimidating notation. One of the strategic goals for this year is to optimize the learnability and I suppose approachability of Rust. So maybe we have to consider this when fancying @ because the intuition is it would stand out in code (which @rpjohnst I think rightfully questions). Am I clear enough what I mean with this?

(Feel free to correct or clarify anything I claimed, since I simply do not have time to follow the discussion on the Rust grammar all that closely.)

nikomatsakis commented 7 years ago

@est31

independent of whether I like your suggestions or not (I actually can live with both), can we please stick to the RFC?

Basically I agree. I don't really want to get involved in big discussions right now -- it's the impl period after all! -- but I did want to put the @ idea "out there" to be simmering in the background. In part, because I tend to get kind of forgetful, and writing things out helps me to remember them later.

My feeling is that at this point the right thing to be doing is:

Implementing the RFC roughly as written.
- We may want to, for example, adopt @rpjohnst's suggestion of ::crate for absolute paths, just to resolve the parsing disambiguity, because it seems like a very small delta, and actually increases consistency overall.
Gain experience in using it.
Revisit the design, keeping in mind alternatives that have been floated in the meantime, and make final stabilization decisions.

This was the spirit in which I meant to write my comment, though I guess I didn't make that clear. Perhaps it would have been better to keep it in a private file. :woman_shrugging:

EDIT: Also, I'm aware that @ and [] were raised earlier, though I don't recall whether the @::foo notation was mentioned before for being relative to the current crate. So I don't mean to claim authorship of the idea.

glaebhoerl commented 7 years ago

OTOH, the idea I mentioned had not been raised previously, and is not so much a modification to the RFC as an extension of it. Also, "The final syntax for absolute paths; there's more bikeshedding to be done here in a context where we can actually try out the various options." is explicitly listed as an unresolved question in the issue body. I agree that, in general, we should try to avoid re-litigating accepted RFCs, but I don't feel this situation is comparable to the one we had with e.g. inclusive ranges (ugh).

nikomatsakis commented 6 years ago

rust-lang / rust

Tracking issue for RFC 2126: Clarify and streamline paths and visibility #44660