rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.23k stars 12.7k forks source link

Tracking issue for RFC 2523, `#[cfg(accessible(::path::to::thing))]` #64797

Open Centril opened 5 years ago

Centril commented 5 years ago

This is a tracking issue for #[cfg(accessible(::path::to::thing))] (rust-lang/rfcs#2523).

Steps

Status

From this comment

Unresolved questions:

None so far.

pickfire commented 5 years ago

@Centril I am interested in doing this, I have never contributed to rust code and would like to try. Any guidance?

Centril commented 5 years ago

@pickfire Cool!

I believe the logic here can be divided into 2 parts roughly:

  1. Syntax:

    1. Add a new sym::accessible in https://doc.rust-lang.org/nightly/nightly-rustc/src/syntax_pos/symbol.rs.html#22.

    2. Feature gate accessible in GATED_CFGS. Also add cfg_accessible to active.rs: https://doc.rust-lang.org/nightly/nightly-rustc/src/syntax/feature_gate/active.rs.html#530. This will also require a new sym::cfg_accessible.

    3. Introduce a match arm for sym::accessible in cfg_matches. This should look mostly like the case for sym::not.

      Here you need to extract an &ast::Path and delegate the interpretation to a function of the rough shape fn is_accessible(sess: &ParseSess, path: &Path) -> bool { ... }

  2. Implement fn is_accessible.

    1. First do some validation. We want to emit errors if !path.is_global(). Use sess.span_diagnostic.struct_span_err.

    2. At this point we have a cycle between libsyntax (conditional compilation code where cfg_matches lives) and e.g. librustc_resolve where the latter has the information we need. To fix this we will need to enrich ParseSess or a modified CrateConfig (as a struct which would include the type alias) with a mechanism to ask if the path is accessible or not. Alternatively you could use some extern "Rust" { fn is_path_accessible(...) -> bool; } to achieve the same idea.

      Points to remember (and test for) when implementing this:

      • A path must not only exist but also be publicly exported from the target crate.

      • Feature gates should be considered in this; check whether the required feature gate is active if the feature is #[unstable(...)].

      • The bit in https://github.com/rust-lang/rfcs/blob/master/text/2523-cfg-path-version.md#inherent-implementations probably requires interactions with librustc_typeck (tho that can be fixed in a later stage of implementation)?

      • The various subsections in the reference section for accessible will need tests to make sure they work right.

      • Here we probably will want to assume that crate metadata for the ::target exists so if target is not a dependency of the current crate then we will answer false. If we don't do this I believe we would run into time-travel / cycle problems.

      • https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.per_ns is probably going to be used somewhere.

      • A question I thought of just now: If ::target::thing is deprecated, does that result in a warning in #[cfg(::target::thing)]?

      At this point I probably know too little about metadata and resolution to provide mentoring notes so I hope @eddyb, @petrochenkov, or maybe @mw can help you out.

petrochenkov commented 5 years ago

Oh no, this RFC passed.

petrochenkov commented 5 years ago

We can reliably answer the accessible(PATH) query for module-relative paths from other crates.

For type-relative paths everything looks harder.

Centril commented 5 years ago

For type-relative paths everything looks harder.

Agreed!

  • Second, even for paths from other crates to resolve a type-relative path we need to know what traits are in scope at the cfg point in this crate.

Can you elaborate on this point a bit? What specifically induces this requirement re. this crate and are there syntactic restrictions on the path we can impose to prevent the requirement from arising? (For example by preventing Foo::Bar::<T> syntactically.)

  • Additionally we need to know what trait aliases are in scope and be able to go from aliases to traits at cfg-expansion time.

Hmm; I don't think I follow how why we need to go from aliases to their bodies here... can you elaborate on this as well?

petrochenkov commented 5 years ago

@Centril

Can you elaborate on this point a bit? What specifically induces this requirement re. this crate and are there syntactic restrictions on the path we can impose to prevent the requirement from arising?

use std::io::Write;

// The cfg result depends on whether the `io::Write` trait is in scope here
#[cfg(::std::fs::File::write)]

There's no syntactic way to detect this.

I don't think I follow how why we need to go from aliases to their bodies here... can you elaborate on this as well?

trait Alias = std::io::Write;

// The cfg result depends on whether the `Alias` refers to `io::Write`
#[cfg(::std::fs::File::write)]
eddyb commented 5 years ago

For type-relative paths, I don't think we should do anything more than inherent impls (because otherwise it's really cfg on a trait bound at best and I doubt that was ever intended by the RFC).

For example by preventing Foo::Bar::<T> syntactically.

There should be no generic arguments on such paths anyway, because there is no machinery to interpret them in any meaningful way.


I'd suggest implementing only module-relative paths to begin with, and do more investigations for type-relative paths, since I suspect they're not as useful. Note that "module-relative" includes Trait::associated_item, since Trait is module-like for name resolution, so it would work for usecases like "did Iterator::foo get added upstream".

petrochenkov commented 5 years ago

The unpleasant part is that for some paths we may give an incorrect answer if we don't consider type-relative resolutions, e.g. using module-relative resolution we may return false for

cfg(MyEnum::MaybeNonExistent)

or

cfg(MyTrait::maybe_non_existent)

even if both paths exist if we apply type-relative resolution.

For imports those would also return "false", but in that case it's a compilation error (unresolved import), rather than a silent un-configuration. We can report an error if the path in cfg is potentially type-resolved, that kinda reduces the usefulness of the feature though.

petrochenkov commented 5 years ago

Module-relative resolutions also have complications. Consider this case, for example:

#[cfg(::proc_macro::TokenStream)]
struct S;

macro_rules! m { () => { extern crate proc_macro; } }

m!();

If we expand the cfg immediately like all cfgs are currently expanded, then we'll return an incorrect answer - false.

That means the path-based cfgs need to be enqueued until they are ready/resolved and only then expanded. (That's exactly the behavior of attribute macros, and it comes with some costs - some restrictions apply to entities like macros or imports if they are expanded from macros. Same restrictions don't apply to cfgs, which are expanded immediately.)

The conclusion is that supporting paths from other crates is not easier than supporting local paths. On the bright side, that means supporting local paths is not harder than extern paths! (Too bad local paths are useless in this context. EDIT: At least on 2018 edition, on 2015 edition everything is a local path.)

pickfire commented 5 years ago

@Centril Thanks for the explanation and the quick response, I do quite understand what you meant but not quite for the others.

Based on my understanding, #[cfg(accessible(::type::alias))] still probably does not work for this?

pickfire commented 5 years ago

Introduce a match arm for sym::accessible in cfg_matches. This should look mostly like the case for sym::not.

Here you need to extract an &ast::Path and delegate the interpretation to a function of the rough shape fn is_accessible(sess: &ParseSess, path: &Path) -> bool { ... }

@Centril Does that mean that I need to add something like

diff --git a/src/libsyntax/ast.rs b/src/libsyntax/ast.rs
index 8b967048848..9b6373bfb42 100644
--- a/src/libsyntax/ast.rs
+++ b/src/libsyntax/ast.rs
@@ -514,6 +514,10 @@ pub enum MetaItemKind {
     ///
     /// E.g., `feature = "foo"` as in `#[feature = "foo"]`.
     NameValue(Lit),
+    /// Value meta item.
+    ///
+    /// E.g., `path` as in `accessible(path)`.
+    Value(Lit),
 }

 /// A block (`{ .. }`).
mx00s commented 4 years ago

Will this support paths to modules, e.g. #[cfg(not(accessible(std::todo)))]? That's a silly example because todo! can be shadowed by a new (polyfill) implementation.

I don't have any more realistic use cases in mind yet; just curious.

nagisa commented 4 years ago

How would

#[cfg(accessible(peach::Peach))]
mod banana {
    crate struct Banana;
}

#[cfg(accessible(banana::Banana))]
mod peach {
    crate struct Peach;
}

work? Given comments from @petrochenkov can we restrict this feature for now (and maybe indefinitely) to support external paths only and nothing related to type-relative resolution?

EDIT: I’ve been pointed to the RFC portion that mentions paths must refer to external crates.

I think that at very least makes type-relative trait resolution that was raised as a concern in https://github.com/rust-lang/rust/issues/64797#issuecomment-540606496 a non-problem. For a use somecrate::Trait; somecrate::Type::method is effectively a use somecrate::Trait; <somecrate::Type as Trait>::method where Trait in the path is crate-local pretty much.

petrochenkov commented 4 years ago

FWIW, I work on implementing a prototype for this in the most conservative variant (https://github.com/petrochenkov/rust/tree/cfgacc), but I've been constantly distracted by job and holiday trips since mid December.

The prototype will work as a macro attribute rather than a cfg mode, will make no distinction between local and external paths and will report a "not sure whether this exist or not" error on any potential type-relative associated item. (The @nagisa's example https://github.com/rust-lang/rust/issues/64797#issuecomment-571131376 should be automatically treated as a cycle in macro resolution/expansion and produce an error.)

petrochenkov commented 4 years ago

FWIW, I work on implementing a prototype for this in the most conservative variant

PR submitted - https://github.com/rust-lang/rust/pull/69870.

petrochenkov commented 4 years ago

Status:

joshtriplett commented 4 years ago

This came up in the @rust-lang/cargo meeting today. If you've specified conditionals on a dependency, such as with [target.'cfg(some_platform)'.dependencies], you might not want to duplicate that information into corresponding cfg directives when using that dependency. It'd be nice to, instead, write #[cfg(accessible(dep_name))], which would be true whenever that dep was available.

Would it be potentially reasonable, and avoid the trait-path corner cases, to only allow #[cfg(accessible(single_token))] for now? You could still write use std::foo::bar; #[cfg(accessible(bar))] item;, but you couldn't access methods that way (trait method or otherwise), so type-specific resolution shouldn't be an issue. Would that work?

petrochenkov commented 4 years ago

Reporting an error on paths that can potentially point to associated items shouldn't be much harder than reporting an error on multi-segment paths, so the single-segment restriction would mostly be artificial. It can be done as an intermediate step though, if the second item in https://github.com/rust-lang/rust/issues/64797#issuecomment-625803760 (which is the current blocker) is going to be (partially) implemented by someone unfamiliar with resolve.

m-ou-se commented 4 years ago

You could still write use std::foo::bar; #[cfg(accessible(bar))] item;

Wouldn't that use fail directly if that bar wasn't accessible?

coolreader18 commented 4 years ago

If I'm understanding correctly, that's the point; it would avoid the question of trait/type paths and just implement a sort of cfg alias as an MVP.

joshtriplett commented 4 years ago

@petrochenkov wrote:

Reporting an error on paths that can potentially point to associated items shouldn't be much harder than reporting an error on multi-segment paths, so the single-segment restriction would mostly be artificial. It can be done as an intermediate step though, if the second item in #64797 (comment) (which is the current blocker) is going to be (partially) implemented by someone unfamiliar with resolve.

I read that message, and what I was wondering was whether that second item would be substantially simpler to implement and stabilize if it doesn't have to take trait methods or any other type-dependent resolution into account.

petrochenkov commented 4 years ago

@joshtriplett It already doesn't have to account for type-dependent resolution, it just has to report an error if type-dependent resolutions are potentially possible.


Accounting for type-dependent resolutions would require emitting some new kind of where clauses resolved during type checking, like

#[cfg(accessigle(Type::dependent))]
struct S { ... }

=>

// Name `S` is still defined, but referring to it is an error if the where predicate ends up being false during type checking, similarly to false bounds with `#![feature(trivial_bounds)]`.
struct S where exists(Type::dependent) {
    ...
}
thomcc commented 4 years ago

A really amazing use case for this IMO is -sys crates For example: libc, currently i have code like this in a current project that does some unixey nonsense:

I briefly almost considered having doing an autoconf-style "do a test build to check existence", expose --cfg have_<c function>.

But more generally: When you consider this in conjunction with performing bindgen, it gets pretty powerful, and makes it vastly easier to support C apis that gain functionality over time (for example, I'd love to use that inside rusqlite as well).

coolreader18 commented 4 years ago

Yup, that's my sorta use case as well: the _os module in RustPython has a bunch of wrappers around libc functions, and it'd be really great if we could just do #[cfg(accessible(libc::foo))] fn os_foo(x: i32) -> i32 { libc::foo(x) }, rather than copying cfg()s from libc that might become inaccurate later. Also for libc constants: it'd be really nice if we could just create a map of String->i32 of constants just by passing the names of all the ones that might exist to a macro.

est31 commented 3 years ago

There is a bit of discussion about cfg(accessible) in the cfg(version) tracking issue: https://github.com/rust-lang/rust/issues/64796#issuecomment-811953979 , see especially the proposal for a more limited and forwards compatible version of cfg(accessible): https://github.com/rust-lang/rust/issues/64796#issuecomment-844650096

Re the forward compatibility concern with type dependent paths outlined in https://github.com/rust-lang/rust/issues/64797#issuecomment-540610042

We can report an error if the path in cfg is potentially type-resolved, that kinda reduces the usefulness of the feature though.

It seems to me that in order to support types at all, one has to parse pub enum SomeEnum anyways, right? It wouldn't be hard to record that SomeEnum is a type and thus anything that comes after SomeEnum must be a type dependent path. So we could error here, no? A name overlap of a module and a type can't occur either as they live in the same namespace.

nikomatsakis commented 3 years ago

I would be quite satisfied with a std/alloc/core version of this for the time being, yes.

thomcc commented 3 years ago

Solving it only for libstd/core/alloc solves the issue for people trying to support old versions of rustc, but I'd argue that's a less common use case than people trying to use conditionally-available functionality in other cases.

As mentioned, it's pretty annoying to use libc without this.

joshtriplett commented 3 years ago

@thomcc I'd absolutely like to see this available for general-purpose use. However, in the short term, the question is "how much of this needs to be available so that people aren't driven towards rustc version detection instead". And for that purpose, it's sufficient to be able to see what's available in the libraries that ship with rustc.

est31 commented 3 years ago

@joshtriplett oh by initially supporting std plus prelude you meant builtin prelude, not the 2018 crates prelude (crates passed via --extern param). Note that supporting the latter isn't any harder really than supporting the builtin prelude only. The problem only arises with modules that are part of the same crate, or stuff like proc_macro that needs an extern crate statement first, as those can be gated by their own cfg statements.

I wonder if it would be possible to get mentorship instructions written for this? I suppose in general there are four tasks: changing the feature to reject crates not from the builtin or extern prelude, splitting of the restricted feature as a separate feature gate so that it can be stabilized earlier, rejecting any path where there is a type that's not at the final position with an error, implementing the actual path searching feature.

petrochenkov commented 2 years ago

https://github.com/rust-lang/rust/pull/97391 improves the implementation to more usable state.

clarfonthey commented 2 years ago

One particular pain point I think the above still doesn't address: this currently doesn't work inside macros, since it receives meta variables unresolved. Not sure whether you'd classify this as a complete blocker of stabilisation or not.

joshtriplett commented 2 years ago

@clarfonthey That could potentially be handled by indirecting through another level of macro to force expansion.

joshtriplett commented 2 years ago

It sounds like cfg_accessible can do quite a bit now, and can see any item, just not methods or similar.

That seems potentially useful enough to consider stabilizing its current functionality.

(Is there any reason we can't name this cfg(accessible(...))?)

joshtriplett commented 2 years ago

(Tagging as impl-incomplete, but that shouldn't block us from stabilizing what's there.)

joshtriplett commented 2 years ago

libs-api: does checking for top-level items seem sufficiently useful to motivate stabilizing this?

joshtriplett commented 2 years ago

We didn't get to this in today's @rust-lang/libs-api meeting, but several individual folks from the team looked at it and enthusiastically expressed support for having it. The capabilities it already has would solve many issues for us.

petrochenkov commented 2 years ago

Is there any reason we can't name this cfg(accessible(...))?

We potentially can, but it will need to expand to something like #[cfg_accessible] anyway and won't behave like other cfgs.

Macro attributes can be enqueued and retried when their expansion fails (and ability to do that is crucial for cfg_accessible), but regular cfgs need to be able to expand immediately.

petrochenkov commented 2 years ago

Path resolutions in cfg_accessible are not currently shielded against "time travel", unlike paths in macros and imports.

It would be good to implement such shielding before the stabilization, otherwise it can potentially cause some breakage later on, although it shouldn't be large. I don't currently have time to work on this, but I can point interested people to the relevant code, please write me a message on Zulip if you are interested.

clarfonthey commented 2 years ago

I'm not in favour of calling it cfg(accessible(...)) unless it can actually work like a cfg would. Not 100% sure on what the actual requirements for that are, though.

thomcc commented 2 years ago

If it's not cfg(accessible), it would be much more painful to use with other cfgs, since you need to do things like #[cfg_attr(any(foo, bar), cfg_accessible(...))], which kind of sucks.

joshtriplett commented 2 years ago

Right, the main reason for cfg(accessible) would be to allow combining with any or all or similar.

est31 commented 2 years ago

An any/all free cfg(accessible) would unlock the sys crate use case: https://github.com/rust-lang/rust/issues/64797#issuecomment-706286311

And I think many more use cases don't need any combinators.

Also, one can support all at least, by making the cfg(accessible) feature desugar cfg(all(foo, bar, all(accessible(a), baz), accessible(b))) to cfg_attr(all(foo, bar, baz), cfg_accessible(a, b)) (pending compiler support for cfg_accessible(a, b)). foo, bar and baz could even contain any and all and such, as long as the criterion is upheld that any cfg(accessible) is reachable from the root of the cfg tree by only visiting all nodes (and no not or any ones). One could relax the criterion even further to "the clauses in the conjunctive normal form representation of the tree containing cfg(accessible) are naked atoms". This would allow stuff like cfg(not(any(foo, not(accessible(a))))).

With all supported, I wonder which use case would need any or not. Generally you do cfg(accessible) right before using a feature you tested accessibility for, so if it evals to false then you would likely not want to use the feature.

Nemo157 commented 2 years ago

not would be needed if you were doing something like providing a polyfill when the feature is not available.

thomcc commented 2 years ago

Yeah, not seems extremely important to me, and hard to work around its absence. Lack of any/all could definitely be worked around, though, just it would be tedious.

m-ou-se commented 2 years ago

A really amazing use case for this IMO is -sys crates For example: libc, currently i have code like this in a current project that does some unixey nonsense:

I'm not sure I agree that's a good use case for this feature. Using cfg(accessible) for that quickly results in hard-to-maintain code that changes behavior depending on whether a symbol exists, without any checks for typos or signatures. There would be all kind of very implicit contracts between crates, similar to what we'd have if we relied on duck-typing everywhere instead of well defined traits and types.

thomcc commented 2 years ago

What would your suggested alternative be?

Currently, the only alternative I know of is to maintain a large list of cfgs for which targets support (e.g. getloadavg or sysctlbyname in the example), and I'm not sure I think that's that's any more maintainable.

m-ou-se commented 2 years ago

Another thing I'm worried about this feature enabling or even encouraging code that depends on future versions of the standard library that don't even exist yet.

For example: If we're experimenting with an unstable type X in the standard library, someone might reasonably write code like this to allow for a seamless transition for the future if/when std::X gets stabilized:

#[cfg(accessible(std::X))]
pub type X = std::X;

#[cfg(not(accessible(std::X)))]
pub struct X { .. }

The problem with this 'future compatible' code is that it assumes the future stable std::X type will be compatible with their own struct X. If we decide to change the unstable API in some breaking ways before stabilization (which we regularly do), stabilization would break this code, even though they were not using nightly Rust or unstable #![feature]s.

We could simply blame the author of this code for using cfg(accessible) like this, but if this is part of a popular library, all their users now have a reason to not upgrade to a new version of Rust, since we broke their dependency in the new Rust version.

thomcc commented 2 years ago

That's a good point, and might be a point in favor of something like a #[cfg(version(...))] instead, which seems much harder to abuse in that way (short of making a called shot about when the feature will land, which seems unlikely to happen frequently enough to matter).

That said, I believe this kind of version checking is regarded as undesirable for various reasons as well (although it's widely used in crates with long MSRV requirements).

m-ou-se commented 2 years ago

What would your suggested alternative be?

Currently, the only alternative I know of is to maintain a large list of cfgs for which targets support (e.g. getloadavg or sysctlbyname in the example), and I'm not sure I think that's that's any more maintainable.

Ideally, something that interacts well with our type and trait system, with where-clauses. E.g. an impl GetLoadAvg for System, such that a function with a where System: GetLoadAverage can call System::getloadavg(). Then it can also be type checked properly on all systems, instead of only on those where the cfg()s are enabled. That'd also prevent a lot of the cases where we accidentally break tier 3 platforms.

I realize that this requires some new language features, and probably some more to make it anywhere close to ergonomic. I really dislike conditional compilation, as it results in similar surprises and headaches as e.g. C++ templates (which are only type checked when instantiated) or type issues in dynamically typed scripting languages. I think conditional compilation is unavoidable in some cases, but I don't think we should encourage more of it.

I'm not sure if I see cfg(accessible) as something to just improve existing #[cfg]s, or as something that encourages even more (and more fragile) #[cfg]s.

thomcc commented 2 years ago

I also have noted in the past that cfg(accessible) can seem like the right solution to a problem, but passing the wrong argument to accessible might end up with the code not running when expected. It's certainly a problem.

I also agree something like what I suggested probably wouldn't be the right fit for use inside libstd, but I don't think I take as hard of a line against conditional compilation. It does make code harder to maintain, but it also comes with a number of benefits (compile time, for example, even if just in terms of not needing to pull in things like windows-sys or objc when building for linux).

I am in favor of ways to avoid the need for it, though, especially inside the stdlib.

(I'm not sure I follow how your suggestion avoids the current situation, but am willing to accept that it may be slightly handwaved given the "this requires some new language features" bit)