rust-lang / rust

Tracking issue for `pub(restricted)` privacy (RFC #1422) #32409

Closed nikomatsakis closed 7 years ago

nikomatsakis commented 8 years ago

Tracking issue for rust-lang/rfcs#1422

RFC text

Milestones:

cuviper commented 7 years ago

@nikomatsakis

I think that is ambiguous too. Consider struct Foo(pub@foo::bar::baz) -- is that struct Foo(pub@foo (::bar::baz))? Or struct Foo(pub@foo::bar (::baz))?

Yes, there would need to be something to disambiguate this. It's enough just to parenthesize the type as you've done if we don't need to support a list of visibility scopes; otherwise that will need some kind of bracketing to group it. I think eagerly parsing the bare path is fine, as @petrochenkov suggests, especially if the "missing type" error can suggest the disambiguation.

pnkfelix commented 7 years ago

@petrochenkov wrote:

allow both pub(crate) and pub(super), but for arbitrary paths, one writes pub(in path)

Note that super currently is an "arbitrary path" too, you need one more special case to make this work.

I don't think I made myself clear.

The idea was that pub(in <path>) is a very general form (so yes, pub(in super) would be legal, as would be pub(in super::super) or pub(in self) or pub(in a::b), et cetera).

But the two presumably most-common cases get special in-free sugar: pub(crate) and pub(super).
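For illustration, the general form and the two sugared forms described above can be sketched as follows. This is a minimal sketch assuming the `pub(in path)` / `pub(crate)` / `pub(super)` spelling under discussion; the module tree and function names are hypothetical.

```rust
// Hypothetical module tree; all names are illustrative only.
mod outer {
    pub mod middle {
        pub mod inner {
            // General form: visible inside `outer` (two levels up) and below.
            pub(in super::super) fn for_outer() -> &'static str {
                "outer"
            }

            // Sugar for `pub(in super)`: visible inside `middle` and below.
            pub(super) fn for_middle() -> &'static str {
                "middle"
            }

            // Sugar for `pub(in crate)`: visible anywhere in this crate.
            pub(crate) fn for_crate() -> &'static str {
                "crate"
            }

            // `pub(in self)` is legal too; it matches the default privacy.
            pub(in self) fn for_inner() -> &'static str {
                "inner"
            }

            // Ordinary `pub` wrapper so the private item can be exercised.
            pub fn call_inner() -> &'static str {
                for_inner()
            }
        }

        // `middle` may call the `pub(super)` item of its child module.
        pub fn via_middle() -> &'static str {
            inner::for_middle()
        }
    }

    // `outer` may call the `pub(in super::super)` item two levels down.
    pub fn via_outer() -> &'static str {
        middle::inner::for_outer()
    }
}
```

Calling `outer::middle::inner::for_outer()` from outside `outer` would be rejected by the compiler, while `outer::via_outer()` works from anywhere that can see `outer`.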

pnkfelix commented 7 years ago

But the two presumably most-common cases get special in-free sugar: pub(crate) and pub(super).

Having said that, I actually would probably be fine with an even sweeter sugar, crate fn foo(...) (which would desugar to pub(in crate) fn foo(...)). (In this case, only the crate gets a sugar, not super. But maybe someone can talk me into the latter.)

I think @nikomatsakis and @petrochenkov have done a fine job laying out why it is useful to provide the full-featured form. It's just a question of how similar we want all the variations of the syntax to look (or, put another way, how much dissimilarity we are willing to put up with to keep the parsing simple).

petrochenkov commented 7 years ago

@pnkfelix

But the two presumably most-common cases get special in-free sugar: pub(crate) and pub(super)

Nit: pub(super) still has the "tuple struct" issue - pub struct S(pub (super::S, super::Z)); - but it's solvable by a couple of tokens of lookahead. I'm still not happy with pub(in path) (see https://github.com/rust-lang/rust/issues/32409#issuecomment-270942220 for my preferred alternative), but I think I can tolerate it if the most common cases pub(super), pub(crate) and pub(self) are all allowed without in.
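The tuple-struct issue mentioned above can be seen in a small sketch. The types `Z`, `S` and `T` are hypothetical, and the comments describe how a parser using a little lookahead would be expected to read each field.

```rust
// Hypothetical types used only for illustration.
pub struct Z;

mod inner {
    // `pub` followed by a parenthesized *type*: the single field is a
    // public tuple `(super::Z, i32)`, not a field restricted to `super`.
    pub struct S(pub (super::Z, i32));

    // Here `(super)` is a visibility restriction on an `i32` field.
    pub struct T(pub(super) i32);
}

// Builds both structs from the parent module and reads their fields.
fn demo() -> (i32, i32) {
    let s = inner::S((Z, 7));
    let t = inner::T(3);
    ((s.0).1, t.0)
}
```

Both declarations start with `pub (super`, which is why the parser has to look past the keyword to see whether a `)` or a `::` follows.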

nikomatsakis commented 7 years ago

I'm 👍 on either pub(super)/pub(crate)/pub(self) or crate fn. I think I lean towards pub(X) and friends because it permits default(X) as well (for constraining the scope of specialization lexically).

kornelski commented 7 years ago

crate fn looks great to me. It's simple. It's a model familiar from other languages where it works well enough.

And it encourages splitting a project into crates, rather than building monoliths with holes poked for arbitrary coupling.

aturon commented 7 years ago

I'm trying to figure out how to draw this discussion to closure. I think there's a rough consensus that the following are the two most plausible options:

I think most stakeholders have said they are OK with at least one of these two choices. Does anyone have strong objections to the second choice? If so, what are they? To me, it seems like it provides the best balance of clarity, expressiveness, extensibility, and tolerable syntax.

petrochenkov commented 7 years ago

For reference (I haven't seen this mentioned in this thread): pub(self) is also useful for fixing issues like https://github.com/rust-lang/rust/issues/32770 without reinventing priv.

withoutboats commented 7 years ago

My only concern on the second is that I'm not sure that pub(in path) is really pulling its weight. I'm pretty in favor of pub(crate)/pub(super)/pub(self).

I'm not saying you'd never want it, I'm just saying maybe you don't want it as bad as you want to not have to think about the privacy scopes you can create it with.

aturon commented 7 years ago

@withoutboats Do you think that supporting all of crate/super/self, plus the points about working with default, are enough to still argue in favor of the pub(restricted) syntax?

I'll note that we can stabilize the pure-keyword versions first, and take longer on the in variant.

withoutboats commented 7 years ago

@aturon not sure how I feel about default(crate), but yeah I do think pub(keyword) is probably the best way forward for now. Are you thinking about stabilizing part of this soon?

Since AFAIK the full pub(path) is currently implemented, it seems like the action item is to go with pub(in path) and then consider stabilizing just the pub(keyword) part after that's done.

petrochenkov commented 7 years ago

Can I get some opinions on pub@path suggested by me and @cuviper ?

I still think it's the least noisy and nicest-looking syntax of those suggested in this thread; it also doesn't require special cases for crate etc. and is extensible to default@path. The parsing ambiguity pub@path :: type can be trivially resolved by greedy parsing; there are already a few places in the syntax where this is done.

petrochenkov commented 7 years ago

It just occurred to me.

Syntactic ambiguity exists only in tuple structs, so let's require disambiguation only in tuple structs (i.e. almost never). It will be permitted in other contexts, but not required.

Pros: 99% of code working today continues working; in 99% of cases the syntax is still concise; no special cases for super/self; still extensible to default(path); there are no problems with $vis matchers in macros.

jimmycuadra commented 7 years ago

That's not a bad idea. It's similar to how you have to write (foo,) to disambiguate a tuple with one element from (foo). It's (AFAIK) the only place this syntax is used because it's the only place it's needed.
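The tuple analogy can be shown with a short sketch; the function names are hypothetical.

```rust
// With the trailing comma, the parentheses build a one-element tuple.
fn one_tuple() -> (i32,) {
    (5,)
}

// Without the comma, the parentheses only group an integer expression.
#[allow(unused_parens)]
fn just_int() -> i32 {
    (5)
}
```

The comma is required only in the one position where the two readings would otherwise collide.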

petrochenkov commented 7 years ago

A much more popular analogy would be the disambiguating "turbofish" ::< in paths required only in expression contexts. (It's not currently permitted in type paths and $path matchers, but that's an artificial restriction.)
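A minimal sketch of the turbofish analogy, with hypothetical function names: the `::<` form appears only where generic arguments are written inside an expression.

```rust
fn parse_answer() -> i32 {
    // Expression context: `::<` is needed before the generic argument.
    "42".parse::<i32>().unwrap()
}

fn empty_bytes() -> Vec<u8> {
    // The return type above is written `Vec<u8>` with no `::`,
    // while the expression below needs `Vec::<u8>`.
    Vec::<u8>::new()
}
```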

kornelski commented 7 years ago

I'd say turbofish and comma in tuples are unfortunate quirks, and shouldn't be used as a precedent for adding more syntax quirks.

pub(self)/pub(super) can be added now, without deciding on the fate of path (and if path turns out to be needed, then it can be decided later whether it should always have the prefix or only sometimes).

leoyvens commented 7 years ago

@pornel It would be weird to stabilize pub(self) as a special case, it would just be a contrived way of saying priv. It would be more natural that pub(self) comes as a special case of pub(path).

arielb1 commented 7 years ago

@pornel

Value paths are different from type path in other ways (e.g., the handling of an empty type parameter list), so it's not a parse-only distinction.

kornelski commented 7 years ago

I don't see a problem with pub(self). It's actually a great description of how it works exactly (I'm surprised that "private" struct fields are accessible from outside of impl Struct - this is different from how private class properties work in other languages).
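The module-level privacy described above can be sketched like this; `account`, `Account` and `peek` are hypothetical names.

```rust
mod account {
    pub struct Account {
        balance: i64, // private: visible throughout the `account` module
    }

    impl Account {
        pub fn new(balance: i64) -> Account {
            Account { balance }
        }
    }

    // A free function, not a method, can still read the private field
    // because it lives in the same module as the struct.
    pub fn peek(a: &Account) -> i64 {
        a.balance
    }
}
```

Code outside `account` can call `account::peek`, but it cannot name `balance` directly, which is the sense in which a private field behaves like `pub(self)`.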

kornelski commented 7 years ago

I suppose stabilising pub(keyword) without paths would be a problem if it turned out that e.g. pub<path> parses better.

eddyb commented 7 years ago

pub<path> couldn't be used because pub <T> ::A parses right now (as pub <T>::A) in tuple structs.
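A small sketch of why `pub <...>` is already taken in tuple structs. It assumes a type parameter bounded by `Iterator`; the names `First` and `first_of` are hypothetical.

```rust
// `<T>::Item` is a qualified path naming an associated type, so in a tuple
// struct `pub <T>::Item` already means "a public field of that type".
pub struct First<T: Iterator>(pub <T>::Item);

// Wraps the first element of a vector, using `vec::IntoIter<i32>` as `T`.
fn first_of(v: Vec<i32>) -> First<std::vec::IntoIter<i32>> {
    let mut it = v.into_iter();
    First(it.next().unwrap())
}
```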

Frankly, it feels like the struct/enum private/public defaults were misguided and a better choice would've been to have private/public defaults based on constructor shapes, i.e. "tuple" structs and enum variants would both be public-by-default, supporting no pub syntax, and everything else private-by-default.

aturon commented 7 years ago

@petrochenkov

Can I get some opinions on pub@path suggested by me and @cuviper ?

I still think it's the least noisy and nicest-looking syntax of those suggested in this thread; it also doesn't require special cases for crate etc. and is extensible to default@path. The parsing ambiguity pub@path :: type can be trivially resolved by greedy parsing; there are already a few places in the syntax where this is done.

I'm personally open to this option, though I have a weak preference in favor of the ()-based syntax. In particular, I personally find it easier to visually parse:

pub(module::submodule) fn foo();
pub@module::submodule fn foo();

default(module::submodule) fn foo();
default@module::submodule fn foo();

aturon commented 7 years ago

@petrochenkov

Syntactic ambiguity exists only in tuple structs, so let's require disambiguation only in tuple structs (i.e. almost never). It will be permitted in other contexts, but not required.

That seems reasonable to me. If we want to consider this route, I'd suggest first stabilizing the explicit in syntax and seeing whether it's painful enough in practice to drop it in most locations.

withoutboats commented 7 years ago

I strongly don't think requiring the in only in tuple structs is a good idea. While the language has some disambiguating quirks (trailing tuple comma and turbofish), this feels qualitatively different in a way I have difficulty explaining. It feels far more arbitrary to require an in keyword or not depending on the shape of the struct it's being used in than either of those feels.

nikomatsakis commented 7 years ago

Even though I value it, I think the use of explicit in will be quite rare -- pub(super) and pub(crate) will predominate. I say we should go with pub(super), pub(crate), and pub(in path) and live with it.

Personally, I wish that tuple struct fields inherited the privacy of the struct, but ... it's a bit late to change that. =)

petrochenkov commented 7 years ago

@withoutboats My general argument against solutions like "pub(in path) everywhere" is that they are not "zero-cost". Tiny corner case affects all the remaining language and makes it worse, even if it's never used itself. I want a solution that doesn't suffer from this issue. "Optimize for the common case", remember? Even if a disambiguation quirk looks bad aesthetically when placed under the magnifying glass, it will still have zero effect in practice. (My suggestion doesn't even look so bad, quite otherwise, it makes the grammar simpler.)


P.S. mod may be a good alternative to in. It gives a simple memo - if you want to disambiguate between a module and a type, use mod to tell it's a module (kinda like typename and template in generic contexts in C++).

durka commented 7 years ago

Please don't add anything that's "kinda like typename and template in generic contexts in C++" :)

withoutboats commented 7 years ago

@petrochenkov I consider requiring in everywhere simpler than requiring in only in one location where this feature could be used. It's not an aesthetic consideration; to be honest, every argument you make is exactly what I would make in the opposite direction.

solson commented 7 years ago

My initial reaction is I don't like mod there since everywhere else it means to define a mod, not refer to one.

I also find pub@path more noisy and uglier than other suggestions, despite all their problems. The @ is a nasty character.

I wish I had more constructive suggestions for the syntax.


I was thinking about the semantics of this feature in general the other day, and the fact is I would only use pub, pub(crate), and the default private.

In my ideal world, pub on an item would mean "this item is exported to other crates", i.e. there is a path to the item where every module along the path is pub, and the compiler would error/warn if this was not the case. Then I would use pub(crate) fn (or crate fn) to mark those things that are pub today but not truly accessible from outside the crate.

This would let a reader tell at a glance whether something is truly exported to external crates. It takes work to figure that out today. I don't know if we can make my ideal world happen, but I currently try to follow it as a convention in my code (on nightly where pub(crate) is available). There's no static checking that pub means exported, but I try to make it true manually.
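The convention described above can be sketched as follows, assuming the code is built as a library crate; the module and function names are hypothetical.

```rust
mod private_impl {
    // Declared `pub`, yet unreachable from other crates because the
    // enclosing module is private.
    pub fn looks_exported() -> u32 {
        1
    }

    // The convention: write `pub(crate)` when the item is only ever
    // meant to be used inside this crate.
    pub(crate) fn crate_only() -> u32 {
        2
    }
}

pub mod api {
    // Every module on the path to this item is `pub`, so a reader can
    // take this `pub` at face value.
    pub fn exported() -> u32 {
        super::private_impl::looks_exported() + super::private_impl::crate_only()
    }
}
```

Nothing in the compiler distinguishes the first two functions today; the difference is only what the annotation tells a reader.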

Lastly, I am uninterested in making items public to a module subtree of the current crate (i.e. pub(path)). I typically avoid deep module hierarchies where this would make sense.

solson commented 7 years ago

In summary, I would prefer a world where we had:

Anyone agree/disagree? Maybe I've missed some compelling need for pub(path), but so far I don't feel like it's a big win.

petrochenkov commented 7 years ago

I wish I had more constructive suggestions for the syntax.

Feel free to choose! (Tokens that can't start a type and are not whitespaces, or closing delimiters, etc)

= <= == != >= > || ~ @ . .. ... , ; : -> <- => # $ { LITERAL + - / % ^ | >> += -= *= /= % ^= &= |= <<= >>= as box break const continue crate else enum false if in let loop match mod move mut pub ref return static struct trait true type use where while

durka commented 7 years ago

pub |path|?

kornelski commented 7 years ago

pub in crate, pub in some::path, pub in super looks neat. It reads like English, and feels consistent with equally punctuation-free pub use foo as bar.

solson commented 7 years ago

@pornel It looks a lot less neat in context, e.g.

pub in foo::bar::baz quux(x: i32) {

Seems pretty hard to scan quickly. Delimiters are nice.

kornelski commented 7 years ago

But paths allow parens! This automatically becomes allowed:

pub in (foo::bar::baz) quux(x: i32) {

And paths are an edge case. Usual case is fine:

pub in super quux(x: i32) {

eddyb commented 7 years ago

But paths allow parens! This automatically becomes allowed:

Nope! Both types and expressions have parenthetical forms, but they're not specific to paths.

solson commented 7 years ago

@pornel I find that "usual case" also harder to scan than a delimited version. It's particularly weird because fn can be preceded by a variety of keywords, so we are used to reading them as somewhat disconnected, e.g.

pub in super unsafe extern fn foo(x: i32) {

vs

pub(super) unsafe extern fn foo(x: i32) {

The latter feels more structured.
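A compilable sketch of the delimited form next to other qualifiers. The names are hypothetical, and the ABI is spelled out as `"C"`, which is an assumption about what the bare `extern` in the snippet above stands for.

```rust
mod ffi_like {
    // The parenthesized restriction reads as one unit in front of the
    // remaining qualifiers.
    pub(super) unsafe extern "C" fn double(x: i32) -> i32 {
        x * 2
    }
}

fn call_double(x: i32) -> i32 {
    // SAFETY: `double` has no preconditions; it is `unsafe` only to
    // mirror the signature shape from the discussion.
    unsafe { ffi_like::double(x) }
}
```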

withoutboats commented 7 years ago

I like the parens because it feels like a caveat added to the pub.

kornelski commented 7 years ago

I like super unsafe tho ;)

nrc commented 7 years ago

I'm in favour of pub (in ...), especially with the shorthand of skipping the in for crate, that feels like a not bad option. Using a sigil (@, etc.) or a sequence of words (pub in ...) feels harder to read and noisier than the parens option.

aturon commented 7 years ago

@solson

Anyone agree/disagree? Maybe I've missed some compelling need for pub(path), but so far I don't feel like it's a big win.

To recap a few points in the thread:

None of this precludes also having crate fn as sugar on top.

aturon commented 7 years ago

In the interest of making progress, I'm going to propose we head to FCP, with the intent to go with pub(crate), pub(self), and pub(in path) as the syntax, just to get the lang team on record with this. Note that this is just a proposal to go into FCP; even if we do so, there will be several weeks to keep bikeshedding :-) But I think we should aim to stabilize this feature with some syntax this cycle.

If you strongly object to the proposed syntax (keeping in mind we can add additional sugar like crate fn later), please speak up.

@rfcbot fcp merge

rfcbot commented 7 years ago

Team member @aturon has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

withoutboats commented 7 years ago

Does super need to be pub(in super) or is it also pub(super)?

aturon commented 7 years ago

@withoutboats also pub(super). Basically pub(keyword) or pub(in path).

withoutboats commented 7 years ago

Great, I was unsure if super is a general keyword or just a special path.

I'm in favor of stabilizing only pub(keyword) while leaving pub(in path) unstable & getting a sense of the impact. I'm still uncertain that pub(in path) is a net win & I'd like to see if it's still a pain point once we have the 90% case resolved.

nikomatsakis commented 7 years ago

@withoutboats

I'm still uncertain that pub(in path) is a net win & I'd like to see if it's still a pain point once we have the 90% case resolved.

Can you elaborate a bit more on what negative you see from pub(in path)? It seems to me that pub(in path) leads to a cleaner overall mental model. That is, things are either public "to the world" or they are public "to a point in this crate (possibly the root)". If you have pub(super), pub(crate), private, and pub all mixed in, it is a rather more chaotic picture to my mind. Or, to put another way, you have the same mental model underneath, but you lack the ability to control it uniformly.

If we were going to just go for crate, pub, or private, I could see that. But then (a) we lose the ability to do things like default(crate) and (b) we fail to support pub(super), which is undeniably a common use case, so to me it feels less good.

I think I must also write deeper crates than others here. I find I very frequently want to make use of nesting. Here are some examples where I would employ something other than pub(crate):

withoutboats commented 7 years ago

@nikomatsakis

I'm not concerned about the mental model, I'm concerned about the kinds of publicity scopes we encourage users to define in their systems.

My experience has been that I have three kinds of items:

To me it seems like the question is about whether or not it's feasible to implement most libraries through a pattern of recursive visibility - every module exposes a handful of types, defined in that module, to the outside world (pub(super)), built from types defined in its submodule that only that module needs to use. Or do we need to be able to push items 2 or more levels up, which is what pub(in path) would be for?

If people end up pushing their items up those levels using re-exports, then pub(in path) clearly reduces the complexity of the library's API. But if we can encourage people to use this sort of recursive scoping I think their projects will be easier to onboard to.

The limit of my understanding is in the low five digits of LOC, so it's plausible to me that pub(in path) is an inevitability as a project reaches a certain size.

nikomatsakis commented 7 years ago

If people end up pushing their items up those levels using re-exports, then pub(in path) clearly reduces the complexity of the library's API. But if we can encourage people to use this sort of recursive scoping I think their projects will be easier to onboard to.

Interesting. My feeling is that pub(in path) would indeed be rather unusual, but useful in a pinch. I am not sure that it will have a big impact on how people structure their code, though; it seems like it's already an awkward enough syntax to be somewhat discouraging.

In any case, I'm ok with keeping pub(in path) feature-gated. My reluctance mainly stems from the fact that I dislike having "yet another" bit of obscure syntax that will linger in limbo due to the fact that, while it might be useful on occasion, it's not useful enough to have a dedicated constituency pushing for it (and yet perhaps not so unuseful as to be axed altogether).


There was one thing that @solson said that I wanted to return to (emphasis mine):

In my ideal world, pub on an item would mean "this item is exported to other crates", i.e. there is a path to the item where every module along the path is pub, and the compiler would error/warn if this was not the case....This would let a reader tell at a glance whether something is truly exported to external crates.

This last sentence sounds strikingly similar to the goals of the current system: if you see private, you know at a glance it is local. But without some lint you don't know how far something is exposed without research. I too find this suboptimal, because when I am auditing safety, I tend to want to know "how far is this exposed".

So, I guess I'm just saying that I think I would be in favor of a lint along the direction of the original priv-in-pub RFC here, that warns when an item is not "as exposed" as its publicness says it should be. I think we'd have to tinker a bit with the design: e.g., I think that if you say pub(super) struct Foo { pub f: usize }, that doesn't merit a warning, as the field is public but it's implicitly capped by the privacy of its struct. I guess this comes back to the private-in-public discussion that (iirc) never quite resolved itself.
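The capped-field case mentioned above can be sketched as follows; `parent`, `child`, `make` and `read` are hypothetical names.

```rust
mod parent {
    pub mod child {
        // The struct is visible only in `parent`; its `pub` field cannot
        // be reached any further than the struct itself.
        pub(super) struct Foo {
            pub f: usize,
        }

        pub(super) fn make() -> Foo {
            Foo { f: 5 }
        }
    }

    // `parent` can name `Foo` and therefore read its `pub` field.
    pub fn read() -> usize {
        child::make().f
    }
}
```

Code outside `parent` never sees `Foo`, so the `pub` on `f` grants nothing beyond `parent`, which is why a lint of this kind would presumably stay quiet here.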

withoutboats commented 7 years ago

My reluctance mainly stems from the fact that I dislike having "yet another" bit of obscure syntax that will linger in limbo due to the fact that, while it might be useful on occasion, it's not useful enough to have a dedicated constituency pushing for it (and yet perhaps not so unuseful as to be axed altogether).

This is fair. My feeling is that the compiler authors are among the most likely to want this syntax, since the compiler is one of the largest Rust projects, and so if it has advocates they'll be well positioned.


In line with @solson's comment, I sort of wish we had a system where mods were public by default, so that an item's visibility declaration declares its visibility without needing to know who has visibility to the module it's in.