rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.93k stars 12.53k forks source link

Tracking issue for `pub(restricted)` privacy (RFC #1422) #32409

Closed nikomatsakis closed 7 years ago

nikomatsakis commented 8 years ago

Tracking issue for rust-lang/rfcs#1422

RFC text

Milestones:

jonas-schievink commented 8 years ago

You probably meant rust-lang/rfcs#1422

nikomatsakis commented 8 years ago

@jonas-schievink indeed. :)

petrochenkov commented 8 years ago

How to parse struct S(pub (std::marker::Send + std::marker::Sync));? It's currently legal.

If we try to parse std::marker::Send as a part of pub(restricted) based on is_path_start(std) == true, then we end up in the middle of a type sum after parsing an arbitrary number of tokens with only a Path in hands and need to recover.

Similar cases with simpler disambiguation:

struct Z(pub (u8, u8));
struct W(pub (u8));

Note that pub(path) is a strict subset of pub TYPE (modulo pub(crate)), so the former can be parsed as the latter and reinterpreted later if necessary.

I suppose everything here can be disambiguated, but I'm concerned about the conflict pub(path) vs pub TYPE in general and how it affects formal properties of Rust grammar.

nikomatsakis commented 8 years ago

@petrochenkov sigh. a very good point. @pnkfelix and I were chatting-- one obvious (but not particularly appealing) option would be to say that tuple structs just can't use pub(restricted) paths. This would be better if we had the notion that one could write struct Z { pub 0: (u8, u8) } to use the more explicit name.

We considered various other synaxes one could use:

One consideration is that I expect most people will want to either make fields pub -- meaning, as public as the struct itself -- or private. So perhaps it's ok that you can't use tuple structs to express more flexible things -- they're kind of a convenience shorthand as is. But it is unfortunate however you slice it.

Stebalien commented 8 years ago

For reference, @nikomatsakis is referring to #1506.

nikomatsakis commented 8 years ago

On Sun, Apr 3, 2016 at 5:52 AM, Vadim Petrochenkov <notifications@github.com

wrote:

Note that pub(path) is a strict subset of pub TYPE (modulo pub(crate)), so the former can be parsed as the latter and reinterpreted later if necessary.

Hmm, I suppose that's true. (Or might be true)

nikomatsakis commented 8 years ago

So in the @rust-lang/lang meeting we discussed this problem, as well as the solution implemented in #33100. Our conclusion was that we really ought to explore the cover grammar approach described earlier by @petrochenkov here:

Note that pub(path) is a strict subset of pub TYPE (modulo pub(crate)), so the former can be parsed as the latter and reinterpreted later if necessary.

It is basically the only approach that lets restricted paths and tuple structs be fully integrated, without any artificial restrictions (like needing to convert to {} form, or needing to introduce an extra set of parens). It's a shame that the most natural grammar would no longer be LL(k) (nor LR(k)), but given that such an obvious cover grammar exists, that shouldn't cause a lot of problems in practice.

nikomatsakis commented 8 years ago

The ambiguous syntax has come up some more on https://github.com/rust-lang/rfcs/pull/1575, where it interferes with the idea of having a "visibility matcher". Some proposals were made for alternative syntax options there:

Most of these were covered (and rejected) in my prior comment, but it is true that the lookahead requirement is onerous.

Stebalien commented 8 years ago

We could always just use another keyword (scope, scoped, restrict, restricted, vis, etc...):

And all related variants...

durka commented 8 years ago

What would it take to change the syntax? Another RFC or just a PR?

nrc commented 8 years ago

@durka another RFC, although it could be a modification RFC that modifies the existing one (probably with some new motivation) rather than a completely fresh RFC

ExpHP commented 7 years ago

I posted on users.rust-lang with some confusion about this. The RFC talks a lot about preventing items from being reexported outside their scope, but so far as I can tell in the current implementation, pub(restricted) use doesn't reexport anything, period; only bare pub use does.

My users.rust-lang example boiled down to its essence: (playpen)

#![feature(pub_restricted)]
fn main() { }

mod far {
    mod outer {
        pub        use far::outer::space::hello_world;
        pub(crate) use far::outer::space::hello_crate;
        pub(super) use far::outer::space::hello_far;
        pub(self)  use far::outer::space::hello_outer;

        mod space {           
            pub        fn hello_world() {}
            pub(crate) fn hello_crate() {}
            pub(far)   fn hello_far()   {}
            pub(super) fn hello_outer() {}
        }

        mod bar {
            use super::hello_world;  // ok
            use super::hello_crate;  // ERROR: no `hello_crate` in `far::outer`
            use super::hello_far;    // ERROR: no `hello_far` in `far::outer`
            use super::hello_outer;  // ERROR: no `hello_outer` in `far::outer`
        }
    }
}

Is this working as intended? To me, it seems that pub(crate) use, pub(super) use, and more generally pub(some::path) use all currently have the same meaning---which feels odd.

petrochenkov commented 7 years ago

@ExpHP You have to add #![feature(item_like_imports)] as well.

ExpHP commented 7 years ago

Ah, I see. I notice said feature also reexports unmarked use sub::module::item (consistent with the RFC's statements that no pub is the same as pub(self)), so I see now how it could possibly be a contentious feature...

(...not that it bothers me! I rather wish it were that way all along! :stuck_out_tongue:)

Edit: Silly me, this has nothing to do with the feature being contentious; it's actually just this RFC

leoyvens commented 7 years ago

Would (pub restricted) be unambiguous?

pnkfelix commented 7 years ago

Some proposals were made for alternative syntax options there

After further discussion with @nikomatsakis, we are wondering whether pub{restricted} (instead of pub(restricted)) is all that bad...

It seems like the shortest path to something that would remain as palatable as the status quo while fixing the struct Tuple(pub (TYPE)) problem.

nikomatsakis commented 7 years ago

I am in favor of pub{restricted}. Well, mostly I am in favor of stabilizing this syntax form -- and that seems the most straightforward and uncomplicated syntax for achieving it. =)

cc @rust-lang/lang -- what does everyone think about the alternative syntax pub{restricted} (note: braces) instead of pub(restricted)?

eddyb commented 7 years ago

I like the prospect of potentially expanding the feature to sibling ("friend") modules, which would match multiple import use syntax if we have braces, so for that reason I'd prefer them over parens.

sbstp commented 7 years ago

If pub{restricted} can help stabilize this, then ok. I'd like to see short cuts for the most common patterns, like crate fn to mean pub{crate} fn. I feel like most people won't bother with finer grained restrictions and use pub{crate/super} most of the time, at least I would. Perhaps a contextual keyword could be used, like in default's case.

nrc commented 7 years ago

I've probably been spending too much time on rustfmt, but pub{foo} looks very bad to me - I think it is rare to use braces like this? We mostly use them for block-like things, the only exception being in use expressions, where they are always preceded by :: (and where, thinking about, I'd prefer to use parens anyway). I'd prefer using [] or <> instead of {} - I don't think any is really appropriate, but at least they suggest block formatting less.

cuviper commented 7 years ago

The {} braces as found in use are similar to Bash brace expansion, where you can read it as repeating the prefix for each item in the list. (use foo::{bar,baz}; -> use foo::bar; use foo::baz;) That repetition doesn't exactly fit the pub idea as well, especially since the braces aren't optional with a single visibility spec.

I like pub[restricted,crate,etc] a little better, because it looks more like an attribute list. Or maybe it should be an actual attribute? I only found that briefly considered in https://github.com/rust-lang/rfcs/pull/1422#issuecomment-169582057.

aturon commented 7 years ago

I spent some time this morning revisiting the initial RFC thread and reading over this one. Like others on the lang team, I'm eager to make a decision about the syntax and get this stabilized. (I personally don't think a new RFC is required to do this; we frequently make surface adjustments like this during the stabilization process.)

A couple of thoughts:

eddyb commented 7 years ago

I don't think pub[m] works any better, as [T] is a valid type and IIRC the problem was with tuple structs which have a visibility just before each field type.

sbstp commented 7 years ago

I like the idea of simplifying it to pub[crate] and pub[super]. But if those are the only options, then is the square bracket notation needed at all? I think keywords could be better. The simplest way would be to add a crate and a super visibility modifier:

fn do_something() {
    // ...
}
pub fn do_something() {
    // ...
}
crate fn do_something() {
    // ...
}
super fn do_something() {
    // ...
}
aturon commented 7 years ago

@eddyb ah yes, of course you're right. Drat.

withoutboats commented 7 years ago

I advocated for limiting this feature to crate and super during the RFC and still think that's a pretty reasonable approach. Since the RFC I have still never wanted anything else, but I am usually in crates of less than 10k lines of code.

nikomatsakis commented 7 years ago

I am still not wild about supporting only crate and super. I agree those will be the most frequent cases, but I definitely find that when I am writing code, I often have top-level modules that form "mini-crates" of their own:

mod ui {...}
mod solve {...}

Most of the time, those top-level modules have only a single level, and hence when importing from cousins I will use super. This would be a good fit:

mod ui {
    mod button {
        use super::form::Form;
    }
    mod form {
        ...
    }
}

But I often find that at least one such module gets complex enough to warrant some internal structure. I personally find super::super pretty hard for me to think about and hence at that point I tend to switch to absolute paths:

mod ui {
    mod button {
        mod draw {
            use ui::form::Form;
        }
        mod other { ... }
    }
    mod form {
        ...
    }
}

I imagine this same pattern will repeat, but I will almost certainly not want to write pub(super::super), because that is very long. Writing pub(ui) would be nice. But in practice I guess I would either be forced to keep module hierarchies shallower than I might have otherwise done, or else to use pub(crate).

In a way, I feel like I might be happier with just pub(crate) than with pub(crate) and pub(super), since at least then the limits are clear up front, versus being something that you only encounter when you try to break-up existing code into a more hierarchical structure. =)

EDIT: And yes then I would favor just writing crate and not pub(crate).

withoutboats commented 7 years ago

I think what I do in those situations is that I essentially repeat the "subcrate" pattern fractally, so that each mod file is a flat facade of its children which it presents to its siblings & parent.

I also would probably be happy with just using crate to mean pub(crate) and no other fanciness in this feature. But again, I'm not dealing with as large crates as other users.

Nemo157 commented 7 years ago

Coming from a C# background (and I believe it's the same in Java) having just pub(crate) makes a lot of sense, it's equivalent to internal visibility. I really can't think of any time where I have wanted to be able to limit visibility more than that, again though I rarely work on high LOC projects (and the ones I have worked on just had everything public and relied on code reviews + decent developers to not make a mess).

jimmycuadra commented 7 years ago

If this is stabilized only allowing pub(crate) and pub(super), does that preclude allowing arbitrary paths in the future? Maybe it would make sense to stabilize a more conservative part of this feature set that everyone is desired and future-compatible (like conservative impl trait), since it sounds like there is still concern from many people that allowing arbitrary paths might not be useful enough to warrant its baggage.

nikomatsakis commented 7 years ago

I think what I do in those situations is that I essentially repeat the "subcrate" pattern fractally, so that each mod file is a flat facade of its children which it presents to its siblings & parent.

That doesn't really work. For example, consider this struct Draw declared inside ui::buttom::draw. I want Draw to be exposed to all of the ui module. What pub should I use when I declare it?

mod ui {
    mod button {
        mod draw {
            pub(X) struct Draw; // <-- what to write here for X?
        }
        mod other { ... }

        pub(super) use self::draw::Draw; // <-- `Draw` is exposed to `ui` as `button::Draw`
    }
    mod form {
        ...
    }
}

If I use pub, it is broader than I wanted. Same with pub(crate). But pub(super) is incorrect: that would make it private to ui::button, and I wanted ui. pub(super::super) would work, but that is the thing I am saying I find unwieldy. I would rather write pub(ui).

But I realize that I can't have what I want if we want to solve the parsing ambiguity (unless we adopt pub{ui}, and I agree it's kind of ugly). Maybe it's worth it to just have the keyword paths like pub(crate) and pub(super) -- but I think we should probably add pub(self) too, in that case, as it is gives us a mechanism to add "private" declarations in some places where we currently lack that ability. =)

withoutboats commented 7 years ago

@nikomatsakis It seems like your concerned about preventing other pub use statements from re-exporting Draw further afield than you'd like. I guess my answer is that I am just not very concerned about that.

Without any restrictions (just raw pub) your example is already exposing Draw only to ui. The concern seems to be that someone in ui does pub use self::button::Draw; when they shouldn't have. Is that what you see this feature mainly as preventing?

If it is, to me it seems more likely that someone will do that because they've decided they need to use Draw somewhere outside of ui, and so they remove your restrictions. That is, these roadblocks don't seem likely to stop someone from doing what you fear because they would only do that if they feel exposing it is well-motivated and that your decision to restrict it was wrong.

withoutboats commented 7 years ago

The point of this feature seemed to me to be more about convenience - if I'm not feeling too attentive to visibility but I know I don't want it in my public API I can just use a crate visibility and call it a day.

sbstp commented 7 years ago

Could a different keyword solve the ambiguity? For instance, pub stays as it is currently, and a vis keyword is added, which always contains a path/modifier.

pub fn()
vis(pub) fn()
vis(self) fn()
vis(crate) fn()
vis(some::path) fn()
Nemo157 commented 7 years ago

Also maybe relevant, @withoutboats post on the Rust module system being too confusing (forum thread). If users are finding the module system itself too confusing, what are they going to think about a visibility system built on top of the module system.

nikomatsakis commented 7 years ago

It seems like your concerned about preventing other pub use statements from re-exporting Draw further afield than you'd like. I guess my answer is that I am just not very concerned about that.

Fair enough. =) It is indeed the model I was shooting for.

So, let's posit that we are going to simplify to super/crate/self. If all we're looking for is a convinence that lets you be sure you don't contaminate your public API, is it worth considering just crate fn foo() as an alternate privacy level (public-crate-private), basically. Like a better version of Java's "package", I guess.

Pro: Short and sweet. Con: pub(super) is also a fairly common desire. Con: Makes crate-level special, which is sometimes seen as a suspicious sign.

Somewhat unrelated, but I like the idea of writing crate impl Foo as a way to add crate-local inherent methods (which could then be added to types outside your control). I want this all the time. (Though it also raises the question of whether one could have crate-private impls of traits, which raises yet more thorny issues.)

It would also mean that (in specialization) we couldn't have default(crate), or at least it wouldn't have the natural correspondence to pub(crate).

eddyb commented 7 years ago

I'm not seeing pub in(path) mentioned as an alternative. It has the beautiful property that it requires no lookahead inside a paren to distinguish it from anything else (assuming we're fine with never having types that start with in) and it's a bit more self-descriptive than everything else. And if we want the crate shorthand we could stabilize that first (but, uhm, crate extern crate would parse, wouldn't it?) and try to make sue pub in(path) isn't stabilized without tested macro interactions.

nikomatsakis commented 7 years ago

@eddyb I considered it at some point I think, but it seems kind of long to me, and I find the space a bit hard to parse (hard to see that the in is associated with the pub).

pub struct Foo {
    pub in(ui) f: Bar
}

vs

pub struct Foo {
    pub(ui) f: Bar
}

Just for reference sake, here it is with crate:

pub struct Foo {
    pub in(crate) f: Bar
}

vs

pub struct Foo {
    pub(crate) f: Bar
}

vs

pub struct Foo {
    crate f: Bar
}

And ... yeah ... crate extern crate would parse. All the more reason to deprecate extern crate, but that's another discussion, isn't it. :)

nrc commented 7 years ago

@nikomatsakis I think another con for just crate is that it has no obvious relation to visibility (although maybe if it were the only reason to use crate, that would be different).

It would also mean that (in specialization) we couldn't have default(crate), or at least it wouldn't have the natural correspondence to pub(crate).

I think it is worth thinking of a solution to both these problems at the same time - seems that visibility and specialisability (and maybe overridability one day, depending on where we go with inheritance) are all properties that should be defined in the same way.

re pub in(foo) I also dislike the space, but it would work for default and reads kind of nicely. I think I prefer pub(foo)/pub[foo]/pub<foo> more than either, but pub in(foo) better than crate.

cuviper commented 7 years ago

Would it be horrible to use @ for this? This could be with some sort of optional bracketing, only required with multiple paths or for disambiguation.

nikomatsakis commented 7 years ago

@withoutboats ah, I remember now another use-case for this feature. Often I have a type that IS publicly exposed to the world, but where I would like to expose some of its fields to some other modules. So in my ui example that might be like:

mod ui {
    mod button {
        mod defn {
            pub struct Button {
                pub(ui) id: u32 // ui widgets get to play with this field, but not others
            }
        }
        pub use self::defn::Button;
    }
    pub use self::button::Button;
}

@cuviper I think that is ambiguous too. Consider struct Foo(pub@foo::bar::baz) -- is that struct Foo(pub@foo (::bar::baz))? Or struct Foo(pub@foo::bar (::baz))?

pnkfelix commented 7 years ago

@withoutboats wrote:

That is, these roadblocks don't seem likely to stop someone from doing what you fear because they would only do that if they feel exposing it is well-motivated and that your decision to restrict it was wrong.

Its fine with me if someone decides that I was wrong and revises my definition to increase its visibility.

Why is that fine? Because someone who comes along the code later and sees pub item-defn will say "ah, at first blush, I will assume that this item is exposes to the universe at large." (And likewise, someone who sees pub(crate) item-defn will likewise make a similar conclusion except the exposure is to the crate at large.)

In other words, one important benefit I see of using pub(restricted) is that someone who reads the code can quickly locally reason about how exposed the item is. (Later they might employ more global reasoning to narrow down the uses.)

pnkfelix commented 7 years ago

In response to the side thread between @eddyb and @nikomatsakis about pub in(path): I too find the space between pub and in to be hard to visually parse.

However, if we assume that the common case here is going to be crate and super, perhaps we can do this: allow both pub(crate) and pub(super), but for arbitrary paths, one writes pub(in path).

This way, the common cases remain succinct, and you need to use in to drop into parsing a more complex path.

nikomatsakis commented 7 years ago

@pnkfelix

However, if assume that the common case here is going to be crate and super, perhaps we can do this: allow both pub(crate) and pub(super), but for arbitrary paths, one writes pub(in path).

I... don't hate this.

eddyb commented 7 years ago

I agree that is a reasonable use of the in keyword and should solve the macro fragment problem.

petrochenkov commented 7 years ago

limiting this feature to crate and super

Even if only pub(crate), pub(super) and pub(self) are left on the surface, compiler will still need to support pub(path) internally because private items in module a::b are desugared into pub(a::b) and pub(super) is desugared into pub(concrete::module) as well. Module-based visibilities is a really nice system that can be orthogonally applied to all kinds of access rights - sealed(path), default(path), mut(path) and maybe something else in the future! It would be silly to restrict it today to a few hard-coded modules due to a minor parsing problem.

petrochenkov commented 7 years ago

pub(in path)

Ugh. Even pub{path} looks better than this, despite looking bad enough on the absolute scale.

allow both pub(crate) and pub(super), but for arbitrary paths, one writes pub(in path)

Note that super currently is an "arbitrary path" too, you need one more special case to make this work.

petrochenkov commented 7 years ago

@nikomatsakis

Consider struct Foo(pub@foo::bar::baz) -- is that struct Foo(pub@foo (::bar::baz))? Or struct Foo(pub@foo::bar (::baz))?

What is the problem in disambiguating by always parsing paths greedily? (e.g. struct Foo(pub@foo::bar::baz) is a syntactic error due to missing type)

(I'm still a fan of https://github.com/rust-lang/rust/pull/33100, btw, that is similar to greedily parsing pub@path ideologically, but less backward compatible)

petrochenkov commented 7 years ago

@sbstp suggests:

Could a different keyword solve the ambiguity? For instance, pub stays as it is currently, and a vis keyword is added, which always contains a path/modifier.

pub fn()
crate fn() // <-- my addition, possible sugar for `vis(crate) fn()`
vis(pub) fn()
vis(self) fn()
vis(crate) fn()
vis(some::path) fn()

At least one reason to like this suggestion is that it gives a common syntactic form for all visibilities (may be useful for macros?) which is also extensible to uses of visibilities in other contexts:

default(pub) fn() // <-- this is supposedly the default, but now there's a way to write it explicitly! 
default(self) fn()
default(crate) fn()
default(some::path) fn()

EDIT: on the other hand, pub(pub) can be easily introduced as well.

An obvious (but not sure if strong) reason to dislike is a new context-sensitive identifier ("weak keyword"). I'm... not even sure it can break anything? vis(path) is a valid type syntactically (a trait object type with type parameters in parens, like Fn(u8)), so the "tuple struct problem" takes place, but vis needs to be a Fn-like trait for the code to be valid, which is unlikely (impossible?).

EDIT: vis can't be applied fully backward compatibly to items in blocks:

fn main() {
    vis(a::b::c) struct S; // <-- visibility
    vis(a::b::c::d); // <-- fn call
}
withoutboats commented 7 years ago

Because someone who comes along the code later and sees pub item-defn will say "ah, at first blush, I will assume that this item is exposes to the universe at large." (And likewise, someone who sees pub(crate) item-defn will likewise make a similar conclusion except the exposure is to the crate at large.)

Right now I don't assume that, because without other pub mod or pub use declarations, it isn't true (and this feature doesn't exist right now). To me it seems like an unfortunate outcome ifits normal for every exported item to have a scoping qualifier on its declaration, because I feel that it would introduce a lot of noise.

I think of this as largely a convenience for complex facading, not something that would be used all the time.