rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.98k stars 12.54k forks source link

Stabilize uniform paths on Rust 2018 #55618

Closed withoutboats closed 5 years ago

withoutboats commented 5 years ago

@rfcbot fcp merge cc #53130

This issue is to track the discussion of the following proposal:

I propose we stabilize the uniform_paths feature in Rust 2018, eliminating the anchored_paths variant. I further propose we backport this stabilization to version 1.31 beta, so that all stable versions of Rust 2018 have this feature.

We've taken informal polls about this question in a paper document. Most of the lang team has favored uniform_paths, I and @cramertj were the only two members who initially favored anchored_paths. Yesterday, we discussed this in the lang team video meeting; in the meeting we sort of crystallized the various pros and cons of the two variants, but didn't reach a consensus.

In the time since the meeting, I've come around to thinking that stabilizing on uniform_paths is probably the best choice. I imagine @cramertj has not changed his mind, and though he said in the meeting he wouldn't block a decision, I hope we can use this FCP period to see if there's any agreeable changes that would make uniform_paths more palatable to him.

I've also written a blog post summarizing the meeting discussion and how my position has evolved, which provides more context on this decision.

I also want to make sure we continue to get community feedback as we reach this final decision! Please contribute new thoughts you have to the discussion. However, we will hopefully make this decision within the next few weeks, and certain proposals are out of scope. Any third proposal which is not completely backwards compatible with the current behavior of paths in Rust 2018 is impossible to ship in Rust 2018 at this point. Variations on anchored or uniform paths that are backwards compatible may be considered, but that also seems unlikely.

withoutboats commented 5 years ago

@rfcbot fcp merge

rfcbot commented 5 years ago

Team member @withoutboats has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

joshtriplett commented 5 years ago

@withoutboats Thank you for your detailed writeup, as well as your careful and considered evaluation.

Having been the sole opposition of more than one language proposal in the past, I would like to explicitly extend the following invitation to @cramertj: If you feel you need more time to write up some thoughts on how uniform_paths could adapt to better meet the needs of Rust users in your group and elsewhere at Google, as well as any concerns about uniform_paths that didn't make it into yesterday's discussion as captured by @withoutboats' writeup, please feel free to either register a concern or simply state that you'd like more time. This is not a vote, and despite rfcbot's default behavior for what it takes to move forward with an FCP, I'd like us to reach a full consensus if possible.

nikomatsakis commented 5 years ago

please feel free to either register a concern or simply state that you'd like more time

It's good to say it explicitly, of course, but -- to be clear -- I feel this is always true for any FCP =)

Centril commented 5 years ago

Excellent writeup @withoutboats. ❤️

Report & Tests

@rfcbot concern stabilization-report-and-tests

Before we stabilize uniform_paths, I think we need a stabilization report that documents what behavior is being stabilized (and what is not), and tests for that behavior. Since @petrochenkov and @eddyb have been at the forefront of implementing the behavior, maybe they could write up the report?

File-centric behavior

@rfcbot concern file-centric-behavior

As for what we're going to stabilize here, @nikomatsakis noted in the paper document that a more file-centric adaptation of uniform_paths is favourable such that you may write:

use std::sync::Arc;

mod foo {
    mod bar {
        // uses the `use` from parent module, because it is in the same file
        fn baz() -> Arc<u32> { ... }
    }
}

Benefits

The file-centric approach is beneficial mainly because it is quite common for languages to take such an approach and Rust is the odd fish in the bunch; By adopting an approach where the above snippet is legal, the speed-bump when learning Rust can be reduced. For example, in Java, inner classes can see the imports at the top of the file.

Another benefit of the file-centric approach is that you no longer have to write the use super::*; boilerplate, which is arguably a misfeature, anymore. This makes writing unit tests more ergonomic:

use some_crate::Thing;

fn foo(x: Bar) -> Baz { ... }

#[cfg(test)]
mod test {
    // use super::*;
    // ^-- you don't have to do this anymore.

    ...
}

Drawbacks

I see two principal drawbacks with the file-centric behavior:

  1. It encourages writing large files and thus may diminish the maintainability of large projects.

  2. Inline modules are no longer units that can be so easily refactored. For example, if you want to move an inline mod foo { ... } into a dedicated file, then you may need to scroll up to the beginning of the file and copy the use statements. This can be mitigated with IDEs -- but not everyone have those.

One mitigating factor here is that it is unlikely to have inline modules which nest; the most common use of inline modules are for unit tests or for small modules that contain platform specific behavior -- in the language team meeting, everyone agreed that this was the case and that nested inline modules were not commonly used.

Conclusion

All in all, while I was initially somewhat skeptical about the file-centric behavior, I think it is in line with one of the core goals of uniform_paths -- to make the module system intuitive and easy to learn. Therefore, I'd like to propose that we incorporate the file-centric behavior into the stabilization (EDIT: or at least make it possible after stabilization...).

Why does it have to happen now? Consider the following (playground):

// Nightly, Edition 2018 on.

#![feature(uniform_paths)]

// Let's assume that there's a crate `foo` in the extern prelude with `foobar` in it.

mod foo {
    pub fn foobar() -> u8 { 1 }
}

mod bar {
    fn barfoo() -> u8 {
        // With the current `uniform_paths`,
        // there's no conflict as far as the resolution system sees it.
        //
        // Thus, if `foo` is in the extern prelude, it will resolve to the
        // `foo` crate and not `foo` the sibling module. If we allow that
        // to pass, then a *breaking change* will be the result if we change
        // to the file centric behavior.
        foo::foobar()
    }
}
lucozadeez commented 5 years ago

Apologies if this has already been considered but would extern::foo work to disambiguate uniform paths?

So the logic would always be

It has the advantage of using an existing keyword. I couldn't see any reason for there to be a parsing ambiguity but I could be wrong there.

Centril commented 5 years ago

@lucozadeez

Apologies if this has already been considered but would extern::foo work to disambiguate uniform paths?

To disambiguate, you can write ::foo instead; this will have the same effect. We could also permit extern::foo but that would be sort of redundant (more than one ways to do it without clear benefits...).

I couldn't see any reason for there to be a parsing ambiguity but I could be wrong there.

I don't think there would be any ambiguity.

joshtriplett commented 5 years ago

@Centril I feel that the "file-centric approach" is something that could be introduced as an extension later, with sufficient care. And as @withoutboats put it, "Any third proposal which is not completely backwards compatible with the current behavior of paths in Rust 2018 is impossible to ship in Rust 2018 at this point. Variations on anchored or uniform paths that are backwards compatible may be considered, but that also seems unlikely."

As such, please consider whether the "file-centric approach" could be written as a compatible add-on.

nikomatsakis commented 5 years ago

@Centril

Before we stabilize uniform_paths, I think we need a stabilization report that documents what behavior is being stabilized (and what is not)

😍 Yes please! 😍

petrochenkov commented 5 years ago

I think we need a stabilization report that documents what behavior is being stabilized (and what is not), and tests for that behavior

Implementation is in progress, so the behavior will change slightly. I landed some preliminary refactorings last weekend and hope to finish the work this weekend. I want to do it in such way that uniform paths are always enabled on 2018, but gated if actually used, so we get feature gate errors instead of "unresolved name" errors. This is possible because "anchored paths with future-proofing" is a strict subset of uniform paths.

I'd prefer to test the new implementation for one more cycle and not backport stabilization to 1.31.

nikomatsakis commented 5 years ago

So @Centril points out that extending to support "all enclosing scopes within the file" would not, strictly speaking, be backwards compatible (without some sort of fallback or prioritization, I guess). For example this would compile today but yield ambiguity tomorrow:

mod rayon { ... }

mod foo {
    use rayon::join; // today, compiles to extern crate `rayon`
}

Similarly, this builds today:

#![feature(uniform_paths)]

mod bar {
    pub fn x() { }
}

fn main() {
    mod bar { pub fn y() { } }

    use bar::x;

    x();
}

Regardless, I'm not inclined to block much longer on this stuff. Que sera sera. Gotta ship someday. =)

nikomatsakis commented 5 years ago

Good point and I suppose this is the cause for their "concern".

Centril commented 5 years ago

I would be OK with skipping the uniform_paths_file_centric adaptation and just go with uniform_paths. But I am concerned about "que sera sera" when we know the better approach (if we can agree it's better...) and it isn't a pipe-dream... It would be one thing to avoid shipping if there was a hypothetical better solution, but it isn't hypothetical here. My main concern with doing changes in 2021 towards the file-centric approach would be technical debt; I'd like to avoid that technical debt even if we have to wait 1/2 more months. I suppose we could do a forward-compat warning and then just do it inside Rust 2018 as well but that is sort of breaking our stability promises...

It seems to me that instituting changes to uniform_paths that ensures forward-compatibility with _file_centric may be as complex as implementing _file_centric itself...? Maybe @petrochenkov can talk a bit about that?

petrochenkov commented 5 years ago

uniform_paths_file_centric

So, my suggestion would be to pursue this in some opt-in form after the edition is released. Then if it's implemented, 2-3 years of usage experience should be enough to decide whether it should be enabled by default in some cases in Rust 2021 or not.

Centril commented 5 years ago
  • Second, I don't think decisions like this (or implementation work like this) need to be done in the last moment.

So I take it you don't think it can be feasibly implemented in the short period remaining until Edition 2018 ships?

  • Third, it's always possible to do backward-compatibly with some opt-in syntax #[transparent] mod m { ... } or something. (Pro: can work on out-of-line modules as well.)

I don't think that would be worth it and it would just be even more technical debt; the idea is that it should be intuitive by default... using that attribute wouldn't be.

  • Fourth, some opt-out syntax is probably needed anyway. What if you actually need the isolation provided by modules? Moving code into a separate file is not always possible or convenient (e.g. what if the module is generated by a macro).

This makes a lot of sense, so it seems like the file-centric idea might need some design work + process given that.

So, my suggestion would be to pursue this in some opt-in form after the edition is released. Then if it's implemented, 2-3 years of usage experience should be enough to decide whether it should be enabled by default in some cases in Rust 2021 or not.

I suppose that the file-centric approach isn't very actionable right now and neither are the forward-compatibility hacks to make it possible later...

I think we might just have to either use forward-compat warnings later (and implement them mid-edition-2018), or do it in Rust 2021 (tho the technical debt isn't great...). Therefore... @rfcbot resolve file-centric-behavior

mark-i-m commented 5 years ago

@withoutboats You're blog post is really helpful. Thanks!


Personally, I've been using edition 2018 beta/nightly for while now, and I haven't really run into self:: or ::foo, so I don't have strong feelings. Overall, my inclination is still towards anchored paths because use statements are easier to mentally parse when reading code, as @withoutboats put it:

The use statements of anchored paths are syntactically disambiguated as to where they come from: either they come from a special namespace crate, super, and self, or it comes from an extern crate. This makes it easy to make sure your use statements are grouped together properly during code review without referencing other information.

djc commented 5 years ago

After reading @withoutboats post, I wanted to make sure the following compromise has been considered and rejected: to do uniform paths, but always requiring the leading :: for extern crate paths.

By disambiguating-by-default, the two disadvantages listed in the blog post get neutralized:

At the expense of undoing the initial advantage that :: would no longer be needed in non-use paths.

This seems like an elegant compromise to me after reading the blog post, but I fully realize it probably has already been discussed to death. Just wanted to throw this out there and maybe allow someone from the lang team to articulate why this alternative is less attractive than uniform-paths as proposed.

mark-i-m commented 5 years ago

I would personally really not like to write or read :: everywhere. My code would end up littered with them. I would rather see self:: in a few places than :: everywhere I use an external crate. I guess that's a matter of taste, though.

TimNN commented 5 years ago

Could someone please remind me how the following would work for uniform and anchored paths?

// Assume there is also an
// extern crate foo;
mod foo;

fn main() {
    // Does this call the module or the crate? (Or is it an error?)
    foo::bar();
    // How would I call the other?
}

Edit: Also, would use self::* be allowed with uniform paths? Or would that syntax be forbidden?

ghost commented 5 years ago

Whichever path design you choose, I'd wish everybody wrote use self::foo::Foo; rather than use foo::Foo; because then it's immediately clear to someone reading the code that foo is not an external crate but probably a submodule given the letter casing. And like the old adage says, code is written once and read many times, so it should be optimized for ease of reading rather than writing. I also think that the common case is when you have multiple imports from the current crate, in which case you wouldn't even be writing use self::foo::Foo; but rather there would be a whole section of imports from the current crate inside which the Foo import would go, and this would be the best option for reading comprehension:

use crate::{
    thismod::foo::Foo,
    // ...
};

But if you choose the uniform paths design, then there will probably be some who write their imports using absolute paths and some using relative paths, and it's the readers of the code who suffer from this mishmash of styles. So, I would definitely prefer you choose the anchored paths design, but I'd make one change, or addition to it: I'd allow you to start the path in a use declaration with the name of an enum type that is currently in scope. This would provide a nicer syntax for importing enum variants (typically in function local scope) by simply: use Bar::*;. And the reason why this doesn't hurt readability is the fact that enum names tend to start with an upper case letter, whereas module and crate names typically start with a lower case letter. That is a good enough hint for the reader that Bar is most likely an enum that's currently in scope.

ghost commented 5 years ago

@TimNN

Could someone please remind me how the following would work for uniform and anchored paths?

// Assume there is also an
// extern crate foo;
mod foo;

fn main() {
    // Does this call the module or the crate? (Or is it an error?)
    foo::bar();
    // How would I call the other?
}

I just tried that out, and in both uniform and anchored paths designs, your code calls the module function. You can unambiguously call the module one with crate::foo::bar() or the extern crate one with ::foo::bar().

But what's more interesting is that this is a really bad behaviour. The act of adding a module should not silently hijack a function coming from an external crate. I think that is one of the main design principles of how imports are designed in the D language.

EDIT: I just re-read about how imports work in D language (which I had mostly forgotten about). The main design principle of imports in D says that: "Adding or removing imports can't change the decision of resolving a function name". Where "resolving a function name" means: "figuring out which function to call when the programmer writes foo()". And in D, importing a module brings all the symbols in that module into the current module's scope, after which there's a specific way in which the symbols coming from all the imported modules fight each other over which one of them is the best match for the call to foo().

This is so different from how things work in Rust, that I seem to be unable to compare the two, probably due to not knowing exactly how this stuff works in either language. But in Rust specifically, an external crate's name seems to get hidden by any other symbol in scope, including symbols coming from glob imports. For example, given the Cargo dependency num = "0.2.0", the following code prints "x: 3", and if you comment out the module num, then the code prints "x: 2", and if you then also comment out the module sub along with use crate::sub::*;, then the code prints "x: 1".

use crate::sub::*;

mod sub {
    // This, whether from a glob import or regular import, hides the extern crate `num`
    pub mod num {
        pub fn one() -> u64 {
            2
        }
    }
}

// This hides both the extern crate `num` and `crate::sub::num` coming from glob import
mod num {
    pub fn one() -> u64 {
        3
    }
}

fn main() {
    let x: u64 = num::one();
    println!("x: {:?}", x);
}

I realize that this kind of hiding behaviour is not the same thing as the "function hijacking" which D language's import rules are designed to prevent: a function that's imported from one module hijacking another function with the same name that's imported from a different module. That kind of hijacking is not possible in Rust either. In the example above, it's just a matter of a "more local" symbol hiding a "less local" or "more external" symbol. I think this is fine, and in fact D language does that kind of thing as well - an imported symbol gets hidden by a module local symbol.

steveklabnik commented 5 years ago

I would be shocked if the “file centric approach” was accepted. This is a major new thing that would be stabilized almost immediately without RFC-like discussion. I don’t like to speak too strongly but I’m shocked it was even suggested.

In terms of “anchored” vs “uniform”, I still prefer anchored, for basically all of these reasons: https://www.reddit.com/r/rust/comments/9toa5j/comment/e8xxs2p

I have a bit more to say but I’m about to catch a flight; in the end, I’m not willing to die on the hill of anchored paths, but do prefer them quite strongly.

kornelski commented 5 years ago

In the discussions submodules and disambiguation are often taken into account, but I'm not seeing much consideration of workspaces, so here's my use case.

In projects I write I use workspaces a lot. I have a natural progression for chunks of code:

  1. In its own module in the same crate, until it outgrows it
  2. In its own crate in the same workspace, unless its useful outside the project
  3. As a separate crate on crates.io

For example, see how many "internal" crates crates.rs has: https://gitlab.com/crates.rs

Currently, refactoring this way is actually convenient as there is no syntactic difference between modules and crates. I can replace mod foo with extern crate foo and it works beautifully (I generally don't nest modules, so implied extern would work fine too).

AFAIK anchored paths variant intentionally removes that ability, which is why I don't like it. Not only I'm not interested in marking whether something is a module or extern crate, I deliberately don't want that distinction and take advantage of modules and crates having uniform syntax.

withoutboats commented 5 years ago

In terms of “anchored” vs “uniform”, I still prefer anchored, for basically all of these reasons: https://www.reddit.com/r/rust/comments/9toa5j/comment/e8xxs2p

Thanks for linking to this comment, I think it succinctly sums up the most common arguments in favor of anchored paths. But I must say, this summary reveals what seems to me to be a flaw in understanding about how paths work in the anchored paths proposal (or indeed, in any version of Rust that exists): the post claims quite emphatically that the appeal is that you "ALWAYS" anchor paths, but this is not true: you only anchor paths in use statements, all other paths use the mechanism that uniform paths propose to apply to use statements as well.

All but the last argument in this comment (which is a claim about Rust's philosophy I don't agree with) seem to fall apart when you consider that even under anchored paths, paths sometimes have the uniform path semantics. You are still subject to the same refactoring hazards, you still have to understand the contextual information, its just not true in the case of use statements.

This is why uniform paths is called uniform paths: it makes this semantics true everywhere, instead of having a separate logic that only applies to one item form.

ghost commented 5 years ago

@kornelski

Currently, refactoring this way is actually convenient as there is no syntactic difference between modules and crates. I can replace mod foo with extern crate foo and it works beautifully (I generally don't nest modules, so implied extern would work fine too).

AFAIK anchored paths variant intentionally removes that ability, which is why I don't like it. Not only I'm not interested in marking whether something is a module or extern crate, I deliberately don't want that distinction and take advantage of modules and crates having uniform syntax.

Both of the new path variants remove that ability inside modules. The uniform paths variant retains that ability only in the current crate's root, which isn't usually where most of the code is.

withoutboats commented 5 years ago

Apologies if this has already been considered but would extern::foo work to disambiguate uniform paths?

This works fine and is even preferable in my opinion; since its only used for disambiguating, being kinda long isnt the problem, and use extern::foo is very clear in its disambiguation.

However, I think we're currently using :: because its consistent with the behavior on 2015. Basically I think the most likely path toward using extern:: is to introduce extern:: as a synonym for ::, later deprecate :: (at least outside of 2015), and possibly someday remove :: in an edition update.

I wanted to make sure the following compromise has been considered and rejected: to do uniform paths, but always requiring the leading :: for extern crate paths.

I'm fairly certain we've discussed this and ultimately rejected it. I see this as the third of the three potentially viable variants on the 2018 path changes, each of which have two of these three desirable properties:

We've already decided that the third bullet is desirable, and its been a question between the other two bullets, which is what anchored and uniform represent.


More broadly, I want to point out that while having ambiguity might sound troubling, the reality is that name resolution is already full of ambiguity, for example, glob imports and the prelude introduce potential name conflicts which are partially solved through shadowing rather than disambiguation.

matklad commented 5 years ago

But I must say, this summary reveals what seems to me to be a flaw in understanding about how paths work in the anchored paths proposal (or indeed, in any version of Rust that exists)

There's a significant difference in 1.30 with respect to the ambiguity argument: we've stabilized extern crates in prelude. In previous versions of rust, we didn't have extern crate/lexical scope ambiguities.

Question: under uniform path, what are the differences between use-paths and paths elsewhere? Or do they behave exactly the same?

petrochenkov commented 5 years ago

@matklad

Question: under uniform path, what are the differences between use-paths and paths elsewhere? Or do they behave exactly the same?

Ideally, the behavior is exactly the same if the code compiles. However, code accepted in non-use paths may fail in use paths due to additional restrictions caused by use paths being resolved early, when module structure of the crate is still in flux. (Same restrictions apply to macro paths as well.)

In reality, the current implementation ignores all kinds of preludes and also macro_rules items. The reimplementation in progress fixes that, but still leaves two caveats: imports never see local variables (let x) and generic parameters (T in fn f<T>() { ... }) - we just don't have necessary infrastructure to do that. Perhaps I'll come up with some temporary future-proofing hack after the reimplementation lands. This issue exists for macro paths as well (even on stable, since 1.30):

mod T { // T1
    pub macro mac() {}
}
fn f<T>() { // T2
    T::mac!(); // Should be an error because T2 shadows T1, but it's not.
}
ghost commented 5 years ago

I see the anchored paths variant as a way of enforcing the best possible style of writing imports, rather than relying on the style guide to do that. In fact, what will the style guide say about imports if the uniform paths variant is chosen? Will it say that it's just fine to start the path in a use declaration with a local symbol, or that one should prefer (in order to make the imports easier to understand) to start the import path with one of those words which the anchored paths variant forces you to start it with? If the uniform paths variant is chosen, you just know that there are going to be people who write use crate::*; in each of their modules just to be able to keep using the old way of writing imports, like this:

mod foo {
    pub struct Foo;
}

mod bar {
    pub struct Bar;
}

mod baz {
    // This declaration...
    use crate::*;

    // ...enables these:
    use foo::Foo;
    use bar::Bar;

    pub fn baz() {
        let _ = Foo;
        let _ = Bar;
    }
}

fn main() {}
liigo commented 5 years ago

cc me

zrzka commented 5 years ago

Please, please, go with anchored paths. Uniform paths brings ambiguity, which is not good.

For me, it's really not a discussion about how many characters one has to type (IDE, autocompletion, ...), but if it's crystal clear what it does when I read the code. And uniform paths are bad from this point of view. In case of use foo::..., one is forced to always check if there's submodule foo or not.

Also the argument that it helps newbies is not that strong - people don't understand self? Then they wont probably understand more advanced concepts like life cycles, borrow checker, ...

gibfahn commented 5 years ago

After playing with both variants, I think my preference would be:

Then for new projects or for early development I can use the simple use foo, and once something moves towards complexity/stability a combination of clippy+rustfmt+rls will hopefully autogenerate something like:

use ::{
    failure::{ensure, Error},
    toml,
    // ...
};
use crate::{
    mod_1::{Foo, bar},
    mod_2,
    // ...
};

This seems like a case where what you want in the simple case (e.g. quick examples in rust playgrounds) isn't what you want in the complex case (large projects with IDE + linting etc). I'd like to be able to start simply but make things more readable/unambiguous later.

cramertj commented 5 years ago

@rfcbot concern idiomatic_use

I won't leave this concern around for long as I'm not interested in blocking progress here, but I want to make sure we nail down some general idea of what an idiomatic "use" import block looks like-- specifically, how crate-internal imports are distinguished from non-crate-internal imports. Under uniform_paths, what is the idiomatic way to write a set of use imports? Under the current 2018 edition, I usually write something like this:

use {
    crate::xxx,
    super::yyy,
    self::my_local_mod::zzz,
    external_crate_one::aaa,
    external_crate_two::bbb,
};

rustfmt knows to keep self::, crate::, and super:: imports separate from the rest of the list of crates. However, if uniform_paths stabilizes, we no longer require a self:: prefix. This will cause rustfmt to sort the self:: imports in with all of the external crate imports:

use {
    crate::xxx,
    super::yyy,
    external_crate_one::aaa,
    my_local_mod::zzz,
    external_crate_two::bbb.
};

I think we're all on the same page that we don't want that (though I've been wrong before :wink:-- perhaps some are fine with this since it mirrors the ambiguity in non-use paths?). So, what's an idiom-minded Rustacean to do? One option would be to pull all crate-local imports into a separate block, like so:

use {
    crate::xxx,
    super::yyy,
    my_local_mod::zzz,
};
use {
    external_crate_one::aaa,
    external_crate_two::zzz,
};

Another variation on this option would be to leave a blank space line after all crate-local imports. Another option would be to pair the use statement with the definition of the local item being imported from:

use {
    ...
};
...
mod foo;
use foo::bar;
...
enum X { A, B, C };
use X::*;

This makes it abundantly clear where the import is coming from, but leaves imports scattered around the module rather than visible in one chunk at the top.

Once we decide on an idiomatic pattern, we should discuss how we can enforce it by tailoring rustfmt, rustc error messages, clippy, etc. appropriately, and how we make the distinction clear to users unfamiliar with the idiom.

joshtriplett commented 5 years ago

I personally prefer to have one block of external imports (written first), and one block of internal imports (written second).

I also, inconsistently, tend to separate out imports of std/core, but I don't feel strongly about that one.

I also tend to do that with separate use statements rather than top-level braces.

use external_crate_one::aaa;
use external_crate_two::bbb;

use local_module::xxx;
use another_local_module::yyy;
matklad commented 5 years ago

@cramertj a common theme in this discussions is that "X works better because rustfmt sorts imports". I'd like to argue that alphabetical sorting of imports is a bad default, and that we shouldn't base decisions on that feature.

The problem with alphabetical sorting is that it is an arbitrary order which is worse than other arbitrary orders. For example if I import several things as use foo::{X, Y, Z}, if I always just add imports to the end, I get a useful arbitrary order which minimizes diffs. I also sometimes like to roughly group imports by type: functions first, types second, traits last, which is another arbitrary order.

For mod foo; declarations, I group them in "suggested reading order", which is significantly better for readability than alphabetical ordering.

What we really want is PEP-8 style of ordering of groups of imports: std first, extern deps second, you lib the last. rustfmt can't do that, because, in either variant, it can't distinguish std from extern, so you have to add blank line yourself.

The fundamental problem is that organizing import is not a syntax-based formatting problem, its a semantic "organize imports" problem, which includes things like removing unused imports, switching between * and explicit imports, etc. This problem should be solved by RLS, and not by plain rustfmt.

Observation: in languages with great IDE support, like Java or Kotlin, you don't notice imports at all:

My feeling is simple: imports are for the IDE to manage. There are default settings; stick to them, leave your imports folded, use auto-import and "optimize imports on the fly".

https://github.com/yole/kotlin-style-guide/issues/37#issuecomment-284974920

To make it clear, I don't argue for or against a particular path variant or against having a preferred style. I argue against using current rustfmt behavior as an argument in favor of a particular flavor of paths.

EDIT: filed https://github.com/rust-lang-nursery/rustfmt/pull/3173

mark-i-m commented 5 years ago

What we really want is PEP-8 style of ordering of groups of imports: std first, extern deps second, you lib the last. rustfmt can't do that, because, in either variant, it can't distinguish std from extern, so you have to add blank line yourself.

This is what I already do, so I think all of the proposals so far are at least on par with the status quo...

cramertj commented 5 years ago

Observation: in languages with great IDE support, like Java or Kotlin, you don't notice imports at all:

@matklad Yup, this whole conversation (and, I'd dare say, the modules conversation in general) is made vastly more important because we don't have the same type of high-quality goto-def / auto-import resolution that other languages get from IDEs.

However, with that in mind, rustfmt today does reorganize imports, and does shuffle them alphabetically, so we need to either force rustfmt to stop doing this, or we need to come up with style conventions that play well with import shuffling.

Personally, I like having a handy standardized tool that can be used to organize imports-- I'd be fine with that being rustfmt, an IDE, or some other piece of tooling, but I do think we need some standardized way of automatically formatting / organizing imports so I don't have to parse everyone else's personal import-schema.

Centril commented 5 years ago

My preferred style of formatting uses, which I don't apply particularly consistently, but which I aspire to, are similar to @joshtriplett's:

  1. I first have a double-newline separated block for all imports of core (if there are any core imports) then std.

  2. Then I have a block for all imports of an external crate; then a block for the next crate, and so on...

  3. Finally I have internal modules.

  4. I don't tend to use { .. } too much but I also don't eschew it entirely...

Example (not from real code I've written, but what I would like to do...):

// standard library: <-- this comment isn't actually here but for your benefit.
use std::mem::{size_of, align_of};
use std::ptr;

// proptest, an external crate: <-- also not actually here.
use proptest::strategy::{Strategy, BoxedStrategy};
use proptest::collection::vec;
use proptest::arbitrary::{any, Arbitrary};

// rand, an external crate: <-- also not actually here.
use rand::Rng;

// regex-syntax, an external crate: <-- also not actually here.
use regex_syntax::{Parser, Error as ParseError};
use regex_syntax::hir::{
    self, Hir, HirKind::*, Literal::*,
    // I mix the forms somewhat freely here based on what makes sense:
    RepetitionKind::{self, *}, RepetitionRange::*
};

// internal modules: <-- also not actually here.
use collection::SizeRange;
use num::sample_uniform_incl;
use strategy::*;
use test_runner::*;

I also think that a style such as the following can be readable / instructive (this time the comments are present in actual code -- yeah, I write comment heavy code...):

//==============================================================================
// Standard library
//==============================================================================

use std::mem::{size_of, align_of};
use std::ptr;

//==============================================================================
// Proptest
//==============================================================================

use proptest::strategy::{Strategy, BoxedStrategy};
use proptest::collection::vec;
use proptest::arbitrary::{any, Arbitrary};

//==============================================================================
// Rand
//==============================================================================

use rand::Rng;

//==============================================================================
// Regex syntax
//==============================================================================

use regex_syntax::{Parser, Error as ParseError};
use regex_syntax::hir::{
    self, Hir, HirKind::*, Literal::*,
    // I mix the forms somewhat freely here based on what makes sense:
    RepetitionKind::{self, *}, RepetitionRange::*
};

//==============================================================================
// Internal modules
//==============================================================================

use collection::SizeRange;
use num::sample_uniform_incl;
use strategy::*;
use test_runner::*;

However, I would primarily use this style of "section-comments" if there are a tonne of imports to reduce chaos in the current file. I would not suggest adopting this as the idomatic style ;) but rustfmt also won't strip these comments, so I don't think they would go against the idiom either.

mark-i-m commented 5 years ago

I don't tend to use { .. } too much but I also don't eschew it entirely...

I actually like to have imports "maximally factored" using nested { .. }. I find it nicely compact and more readable. I don't really like having a really long section at the beginning of files with lots of imports.

Centril commented 5 years ago

@mark-i-m hehe - I personally think compactness in this instance makes it too cluttered and so less readable, but I think that's very much a matter of taste. :) I think perhaps the more un-opinionated idiom, which I think fits both of our styles is that "there should be grouping of imports based on where they come from and at least standard library imports, external crates, and local imports should be separated". I think where we differ is mostly how we visually delineate grouping?

(I'm not super principled here and I think I could be won over...)

mark-i-m commented 5 years ago

@Centril Yes, I think we are not actually that far apart. I suppose my use of "compact" is not exactly right. I do like compactness, but more than that I'm really looking for "less noisy". For example, I would prefer the example you posted as follows:

use std::{
  mem::{size_of, align_of},
  ptr,
};

use proptest::{
  strategy::{Strategy, BoxedStrategy},
  collection::vec,
  arbitrary::{any, Arbitrary}
};
use rand::Rng;
use regex_syntax::{
  Parser,
  Error as ParseError,
  hir::{
    self,
    Hir,
    HirKind::*,
    Literal::*,
    RepetitionKind::{self, *},
    RepetitionRange::*
  },
};

use collection::SizeRange;
use num::sample_uniform_incl;
use strategy::*;
use test_runner::*;
Centril commented 5 years ago

@mark-i-m yeah that's pretty close I think; I might slightly favor use of a bit more horizontal real estate (<= 80 characters) than vertical estate. I think perhaps tho we should not legislate too tightly here and allow for some artistic freedom? I think we can agree that the most important part is the separation between std/external/internal?

KiChjang commented 5 years ago

I'm not sure if anyone else has raised this observation before, but the anchored paths variant (which is what I prefer) contains a very nice property. Consider the following:

use a::b::c;

If a is none of crate, super or self, then I know immediately that a is the name of an external crate, and I wouldn't have to spend time looking in a potentially large file where mod a is defined. My other arguments for anchored paths is pretty much summed up very well by @steveklabnik's link to the reddit comment (in particular point 2).

And perhaps this "1path" ideal has been elaborated and discussed in depth elsewhere, but I am not really aware of all of its benefits and its usefulness, even after reading @withoutboats' blog post. The question that I'm really wanting to answer is this: does it make sense to stick to this rule as closely as possible? Can someone please explain in detail? Thank you.

KiChjang commented 5 years ago

Also, with the addition of nested imports, I can imagine myself writing the following:

use self::{
    a::b::c,
    foo::{Bar, Baz},
};

... so the additional overhead of typing self:: doesn't really bother me.

EDIT: In fact, this turns out to be an advantage of anchored paths, because you can't write this in uniform paths!

ghost commented 5 years ago

@KiChjang

use a::b::c;

If a is none of crate, super or self, then I know immediately that a is the name of an external crate, and I wouldn't have to spend time looking in a potentially large file where mod a is defined.

In the uniform paths variant, instead of searching the large module file for mod a, you could just look at the Cargo.toml file. And if there's an external dependency named a, then that's what the import must be referring to, because if there were also a local symbol a, then that import would be an ambiguity error. And if there's no dependency a in Cargo.toml, then you know a must be a local symbol, in this case a module. But you're right in that with anchored paths variant all that is immediately obvious.

use self::{
    a::b::c,
    foo::{Bar, Baz},
};

... EDIT: In fact, this turns out to be an advantage of anchored paths, because you can't write this in uniform paths!

You can write that in the uniform paths variant. You can write imports in the uniform paths variant exactly the same way as you would write them in the anchored paths variant. The only thing uniform paths variant does different from the anchored paths variant is that it allows you to start an import path with a local symbol (given that there's no extern crate with the same name), just like you can in paths elsewhere.

eddyb commented 5 years ago

EDIT: In fact, this turns out to be an advantage of anchored paths, because you can't write this in uniform paths!

You can write it like that or without self::, both work.

zrzka commented 5 years ago

In the uniform paths variant, instead of searching the large module file for mod a, you could just look at the .toml file.

Hmm, why am I forced to check the .toml file? I still think it should be enough to look at the source code without looking at the .toml file to get info if it's a submodule, ... or an external crate. If you're a crate author, you know it, but rest is forced to check multiple places. Reading foo.rs, ah, there's bar, is this submodule or an external crate? Let's check long, unsorted .toml dependencies, ah, no bar here, let's check foo.rs again if it's inside somewhere (mod bar) or if there's a file named bar.rs, etc. One argument was that people don't understand self::. But this adds another burden, much more worse than self:: IMO.

kornelski commented 5 years ago

If you know about a project so little that you don't even know if something is a module or a crate, you're very unlikely to even know how to use that crate/module. But then:

cargo doc --open

will show you modules and crates, with search, and info how to use them. So I think for getting an idea of how code is it's already a better solution than reading imports and toml files.

There's another case when you know the code, but you don't know whether in certain file/context a name refers to the crate or a module. In the discussion doc @withoutboats said they've got gc crate for low-level GC stuff, and gc module for GC wrapper type.

IMHO this is a case of ambiguous naming, and could be solved by being more specific, e.g. the low-level GC could be called llgc and the type wrapper could be gcty (or something more verbose if you like). Here the key distinction is not whether it's a crate or module — that's just a side-effect of arbitrary organization. If both were put in crates or both were put in modules, that would be a problem. Here the key difference, inherent in the semantics/purpose of the code, is whether it's the low-level or high-level part of gc, so IMHO it makes more sense to just reflect that in the name.

zrzka commented 5 years ago

If you know about a project so little that you don't even know if something is a module or a crate, you're very unlikely to even know how to use that crate/module.

It's not about crate/module usage or doc generation. Let me rephrase it.

Imagine this - you're a new person to Rust (not necessarily), you found some crate and you'd like to learn how some thing is done. And this thing is not covered with docstring, not in public API, ... you're just skimming through the code. And this ambiguity makes it harder, because as I wrote, you have to find what use foo is and where it is. Imagine you're skimming through the code on iPad @ GitHub, browser, ... Maybe I'm dumb, because I just don't understand how this helps at all. Experienced people can use anything, but try to think about new people, skimming through code, learning, ...

P.S. I can live with both ways, I'm just trying to argue for new people :)

tyranron commented 5 years ago

There's another case when you know the code, but you don't know whether in certain file/context a name refers to the crate or a module. In the discussion doc @withoutboats said they've got gc crate for low-level GC stuff, and gc module for GC wrapper type.

IMHO this is a case of ambiguous naming, and could be solved by being more specific, e.g. the low-level GC could be called llgc and the type wrapper could be gcty (or something more verbose if you like). Here the key distinction is not whether it's a crate or module — that's just a side-effect of arbitrary organization. If both were put in crates or both were put in modules, that would be a problem. Here the key difference, inherent in the semantics/purpose of the code, is whether it's the low-level or high-level part of gc, so IMHO it makes more sense to just reflect that in the name.

Uniform paths clearly tend to be more ergonomic, advances refactoring and are more elegant and universal from theoretical point of view. But... that ambiguity really makes me scary.

Sometimes we can control naming, sometimes we cannot, and sometimes code author may do not care about this so much, and we're all just humans, so tend to be error-prone. If there is an ambiguity possibility then it will happen anyway at some right time to make all the things in the worst way.

I would sleep better if I knew that some kind of bad things cannot happen by design. This is what I love Rust for.