Open Centril opened 5 years ago
@Centril I am interested in doing this, I have never contributed to rust code and would like to try. Any guidance?
@pickfire Cool!
I believe the logic here can be divided into 2 parts roughly:
Syntax:
Add a new sym::accessible
in https://doc.rust-lang.org/nightly/nightly-rustc/src/syntax_pos/symbol.rs.html#22.
Feature gate accessible
in GATED_CFGS
. Also add cfg_accessible
to active.rs
: https://doc.rust-lang.org/nightly/nightly-rustc/src/syntax/feature_gate/active.rs.html#530. This will also require a new sym::cfg_accessible
.
Introduce a match arm for sym::accessible
in cfg_matches
. This should look mostly like the case for sym::not
.
Here you need to extract an &ast::Path
and delegate the interpretation to a function of the rough shape fn is_accessible(sess: &ParseSess, path: &Path) -> bool { ... }
Implement fn is_accessible
.
First do some validation. We want to emit errors if !path.is_global()
. Use sess.span_diagnostic.struct_span_err
.
At this point we have a cycle between libsyntax
(conditional compilation code where cfg_matches
lives) and e.g. librustc_resolve
where the latter has the information we need. To fix this we will need to enrich ParseSess
or a modified CrateConfig
(as a struct which would include the type alias) with a mechanism to ask if the path is accessible or not. Alternatively you could use some extern "Rust" { fn is_path_accessible(...) -> bool; }
to achieve the same idea.
Points to remember (and test for) when implementing this:
A path must not only exist but also be publicly exported from the target crate.
Feature gates should be considered in this; check whether the required feature gate is active if the feature is #[unstable(...)]
.
The bit in https://github.com/rust-lang/rfcs/blob/master/text/2523-cfg-path-version.md#inherent-implementations probably requires interactions with librustc_typeck (tho that can be fixed in a later stage of implementation)?
The various subsections in the reference section for accessible
will need tests to make sure they work right.
Here we probably will want to assume that crate metadata for the ::target
exists so if target
is not a dependency of the current crate then we will answer false
. If we don't do this I believe we would run into time-travel / cycle problems.
https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.per_ns is probably going to be used somewhere.
A question I thought of just now: If ::target::thing
is deprecated, does that result in a warning in #[cfg(::target::thing)]
?
At this point I probably know too little about metadata and resolution to provide mentoring notes so I hope @eddyb, @petrochenkov, or maybe @mw can help you out.
Oh no, this RFC passed.
We can reliably answer the accessible(PATH)
query for module-relative paths from other crates.
For type-relative paths everything looks harder.
cfg
point in this crate.
This means merging more of the late resolution into early resolution.For type-relative paths everything looks harder.
Agreed!
- Second, even for paths from other crates to resolve a type-relative path we need to know what traits are in scope at the
cfg
point in this crate.
Can you elaborate on this point a bit? What specifically induces this requirement re. this crate and are there syntactic restrictions on the path we can impose to prevent the requirement from arising? (For example by preventing Foo::Bar::<T>
syntactically.)
- Additionally we need to know what trait aliases are in scope and be able to go from aliases to traits at cfg-expansion time.
Hmm; I don't think I follow how why we need to go from aliases to their bodies here... can you elaborate on this as well?
@Centril
Can you elaborate on this point a bit? What specifically induces this requirement re. this crate and are there syntactic restrictions on the path we can impose to prevent the requirement from arising?
use std::io::Write;
// The cfg result depends on whether the `io::Write` trait is in scope here
#[cfg(::std::fs::File::write)]
There's no syntactic way to detect this.
I don't think I follow how why we need to go from aliases to their bodies here... can you elaborate on this as well?
trait Alias = std::io::Write;
// The cfg result depends on whether the `Alias` refers to `io::Write`
#[cfg(::std::fs::File::write)]
For type-relative paths, I don't think we should do anything more than inherent impls (because otherwise it's really cfg
on a trait bound at best and I doubt that was ever intended by the RFC).
For example by preventing
Foo::Bar::<T>
syntactically.
There should be no generic arguments on such paths anyway, because there is no machinery to interpret them in any meaningful way.
I'd suggest implementing only module-relative paths to begin with, and do more investigations for type-relative paths, since I suspect they're not as useful.
Note that "module-relative" includes Trait::associated_item
, since Trait
is module-like for name resolution, so it would work for usecases like "did Iterator::foo
get added upstream".
The unpleasant part is that for some paths we may give an incorrect answer if we don't consider type-relative resolutions, e.g. using module-relative resolution we may return false
for
cfg(MyEnum::MaybeNonExistent)
or
cfg(MyTrait::maybe_non_existent)
even if both paths exist if we apply type-relative resolution.
For imports those would also return "false", but in that case it's a compilation error (unresolved import), rather than a silent un-configuration.
We can report an error if the path in cfg
is potentially type-resolved, that kinda reduces the usefulness of the feature though.
Module-relative resolutions also have complications. Consider this case, for example:
#[cfg(::proc_macro::TokenStream)]
struct S;
macro_rules! m { () => { extern crate proc_macro; } }
m!();
If we expand the cfg
immediately like all cfg
s are currently expanded, then we'll return an incorrect answer - false
.
That means the path-based cfgs need to be enqueued until they are ready/resolved and only then expanded. (That's exactly the behavior of attribute macros, and it comes with some costs - some restrictions apply to entities like macros or imports if they are expanded from macros. Same restrictions don't apply to cfgs, which are expanded immediately.)
The conclusion is that supporting paths from other crates is not easier than supporting local paths. On the bright side, that means supporting local paths is not harder than extern paths! (Too bad local paths are useless in this context. EDIT: At least on 2018 edition, on 2015 edition everything is a local path.)
@Centril Thanks for the explanation and the quick response, I do quite understand what you meant but not quite for the others.
Based on my understanding, #[cfg(accessible(::type::alias))]
still probably does not work for this?
Introduce a match arm for
sym::accessible
in cfg_matches. This should look mostly like the case forsym::not
.Here you need to extract an &ast::Path and delegate the interpretation to a function of the rough shape
fn is_accessible(sess: &ParseSess, path: &Path) -> bool { ... }
@Centril Does that mean that I need to add something like
diff --git a/src/libsyntax/ast.rs b/src/libsyntax/ast.rs
index 8b967048848..9b6373bfb42 100644
--- a/src/libsyntax/ast.rs
+++ b/src/libsyntax/ast.rs
@@ -514,6 +514,10 @@ pub enum MetaItemKind {
///
/// E.g., `feature = "foo"` as in `#[feature = "foo"]`.
NameValue(Lit),
+ /// Value meta item.
+ ///
+ /// E.g., `path` as in `accessible(path)`.
+ Value(Lit),
}
/// A block (`{ .. }`).
Will this support paths to modules, e.g. #[cfg(not(accessible(std::todo)))]
? That's a silly example because todo!
can be shadowed by a new (polyfill) implementation.
I don't have any more realistic use cases in mind yet; just curious.
How would
#[cfg(accessible(peach::Peach))]
mod banana {
crate struct Banana;
}
#[cfg(accessible(banana::Banana))]
mod peach {
crate struct Peach;
}
work? Given comments from @petrochenkov can we restrict this feature for now (and maybe indefinitely) to support external paths only and nothing related to type-relative resolution?
EDIT: I’ve been pointed to the RFC portion that mentions paths must refer to external crates.
I think that at very least makes type-relative trait resolution that was raised as a concern in https://github.com/rust-lang/rust/issues/64797#issuecomment-540606496 a non-problem. For a use somecrate::Trait; somecrate::Type::method
is effectively a use somecrate::Trait; <somecrate::Type as Trait>::method
where Trait
in the path is crate-local pretty much.
FWIW, I work on implementing a prototype for this in the most conservative variant (https://github.com/petrochenkov/rust/tree/cfgacc), but I've been constantly distracted by job and holiday trips since mid December.
The prototype will work as a macro attribute rather than a cfg
mode, will make no distinction between local and external paths and will report a "not sure whether this exist or not" error on any potential type-relative associated item.
(The @nagisa's example https://github.com/rust-lang/rust/issues/64797#issuecomment-571131376 should be automatically treated as a cycle in macro resolution/expansion and produce an error.)
FWIW, I work on implementing a prototype for this in the most conservative variant
PR submitted - https://github.com/rust-lang/rust/pull/69870.
Status:
#[cfg_accessible(path)] item
. The attribute can configure or unconfigure the item
and wait until the predicate "path
is accessible" becomes determinate.
#[cfg(accessible)]
into #[cfg_accessible]
is not implemented, we need to consider doing or not doing it only when everything else is implemented.This came up in the @rust-lang/cargo meeting today. If you've specified conditionals on a dependency, such as with [target.'cfg(some_platform)'.dependencies]
, you might not want to duplicate that information into corresponding cfg
directives when using that dependency. It'd be nice to, instead, write #[cfg(accessible(dep_name))]
, which would be true whenever that dep was available.
Would it be potentially reasonable, and avoid the trait-path corner cases, to only allow #[cfg(accessible(single_token))]
for now? You could still write use std::foo::bar; #[cfg(accessible(bar))] item;
, but you couldn't access methods that way (trait method or otherwise), so type-specific resolution shouldn't be an issue. Would that work?
Reporting an error on paths that can potentially point to associated items shouldn't be much harder than reporting an error on multi-segment paths, so the single-segment restriction would mostly be artificial. It can be done as an intermediate step though, if the second item in https://github.com/rust-lang/rust/issues/64797#issuecomment-625803760 (which is the current blocker) is going to be (partially) implemented by someone unfamiliar with resolve.
You could still write
use std::foo::bar; #[cfg(accessible(bar))] item;
Wouldn't that use
fail directly if that bar
wasn't accessible?
If I'm understanding correctly, that's the point; it would avoid the question of trait/type paths and just implement a sort of cfg alias as an MVP.
@petrochenkov wrote:
Reporting an error on paths that can potentially point to associated items shouldn't be much harder than reporting an error on multi-segment paths, so the single-segment restriction would mostly be artificial. It can be done as an intermediate step though, if the second item in #64797 (comment) (which is the current blocker) is going to be (partially) implemented by someone unfamiliar with resolve.
I read that message, and what I was wondering was whether that second item would be substantially simpler to implement and stabilize if it doesn't have to take trait methods or any other type-dependent resolution into account.
@joshtriplett It already doesn't have to account for type-dependent resolution, it just has to report an error if type-dependent resolutions are potentially possible.
Accounting for type-dependent resolutions would require emitting some new kind of where clauses resolved during type checking, like
#[cfg(accessigle(Type::dependent))]
struct S { ... }
=>
// Name `S` is still defined, but referring to it is an error if the where predicate ends up being false during type checking, similarly to false bounds with `#![feature(trivial_bounds)]`.
struct S where exists(Type::dependent) {
...
}
A really amazing use case for this IMO is -sys crates For example: libc
, currently i have code like this in a current project that does some unixey nonsense:
This one is attempting to replicate #[cfg(accessible(libc::getloadavg))]
#[cfg(any(
target_os = "macos",
target_os = "ios",
target_os = "linux", // android?
target_os = "netbsd",
target_os = "openbsd",
// target_os = "freebsd", ??
// TODO: verify
))]
This one is attempting to replicate #[cfg(accessible(libc::sysctlbyname))]
#[cfg(any(
target_os = "macos",
target_os = "ios",
target_os = "freebsd"
// does anybody else have this?
))]
I briefly almost considered having doing an autoconf-style "do a test build to check existence", expose --cfg have_<c function>
.
But more generally: When you consider this in conjunction with performing bindgen, it gets pretty powerful, and makes it vastly easier to support C apis that gain functionality over time (for example, I'd love to use that inside rusqlite as well).
Yup, that's my sorta use case as well: the _os
module in RustPython has a bunch of wrappers around libc functions, and it'd be really great if we could just do #[cfg(accessible(libc::foo))] fn os_foo(x: i32) -> i32 { libc::foo(x) }
, rather than copying cfg()
s from libc that might become inaccurate later. Also for libc constants: it'd be really nice if we could just create a map of String->i32
of constants just by passing the names of all the ones that might exist to a macro.
There is a bit of discussion about cfg(accessible)
in the cfg(version) tracking issue: https://github.com/rust-lang/rust/issues/64796#issuecomment-811953979 , see especially the proposal for a more limited and forwards compatible version of cfg(accessible)
: https://github.com/rust-lang/rust/issues/64796#issuecomment-844650096
Re the forward compatibility concern with type dependent paths outlined in https://github.com/rust-lang/rust/issues/64797#issuecomment-540610042
We can report an error if the path in cfg is potentially type-resolved, that kinda reduces the usefulness of the feature though.
It seems to me that in order to support types at all, one has to parse pub enum SomeEnum
anyways, right? It wouldn't be hard to record that SomeEnum
is a type and thus anything that comes after SomeEnum
must be a type dependent path. So we could error here, no? A name overlap of a module and a type can't occur either as they live in the same namespace.
I would be quite satisfied with a std/alloc/core version of this for the time being, yes.
Solving it only for libstd/core/alloc solves the issue for people trying to support old versions of rustc, but I'd argue that's a less common use case than people trying to use conditionally-available functionality in other cases.
As mentioned, it's pretty annoying to use libc without this.
@thomcc I'd absolutely like to see this available for general-purpose use. However, in the short term, the question is "how much of this needs to be available so that people aren't driven towards rustc version detection instead". And for that purpose, it's sufficient to be able to see what's available in the libraries that ship with rustc.
@joshtriplett oh by initially supporting std plus prelude you meant builtin prelude, not the 2018 crates prelude (crates passed via --extern param). Note that supporting the latter isn't any harder really than supporting the builtin prelude only. The problem only arises with modules that are part of the same crate, or stuff like proc_macro that needs an extern crate statement first, as those can be gated by their own cfg statements.
I wonder if it would be possible to get mentorship instructions written for this? I suppose in general there are four tasks: changing the feature to reject crates not from the builtin or extern prelude, splitting of the restricted feature as a separate feature gate so that it can be stabilized earlier, rejecting any path where there is a type that's not at the final position with an error, implementing the actual path searching feature.
https://github.com/rust-lang/rust/pull/97391 improves the implementation to more usable state.
One particular pain point I think the above still doesn't address: this currently doesn't work inside macros, since it receives meta variables unresolved. Not sure whether you'd classify this as a complete blocker of stabilisation or not.
@clarfonthey That could potentially be handled by indirecting through another level of macro to force expansion.
It sounds like cfg_accessible
can do quite a bit now, and can see any item, just not methods or similar.
That seems potentially useful enough to consider stabilizing its current functionality.
(Is there any reason we can't name this cfg(accessible(...))
?)
(Tagging as impl-incomplete, but that shouldn't block us from stabilizing what's there.)
libs-api: does checking for top-level items seem sufficiently useful to motivate stabilizing this?
We didn't get to this in today's @rust-lang/libs-api meeting, but several individual folks from the team looked at it and enthusiastically expressed support for having it. The capabilities it already has would solve many issues for us.
Is there any reason we can't name this
cfg(accessible(...))
?
We potentially can, but it will need to expand to something like #[cfg_accessible]
anyway and won't behave like other cfg
s.
Macro attributes can be enqueued and retried when their expansion fails (and ability to do that is crucial for cfg_accessible
), but regular cfg
s need to be able to expand immediately.
Path resolutions in cfg_accessible
are not currently shielded against "time travel", unlike paths in macros and imports.
It would be good to implement such shielding before the stabilization, otherwise it can potentially cause some breakage later on, although it shouldn't be large. I don't currently have time to work on this, but I can point interested people to the relevant code, please write me a message on Zulip if you are interested.
I'm not in favour of calling it cfg(accessible(...))
unless it can actually work like a cfg
would. Not 100% sure on what the actual requirements for that are, though.
If it's not cfg(accessible)
, it would be much more painful to use with other cfgs, since you need to do things like #[cfg_attr(any(foo, bar), cfg_accessible(...))]
, which kind of sucks.
Right, the main reason for cfg(accessible)
would be to allow combining with any
or all
or similar.
An any/all free cfg(accessible)
would unlock the sys crate use case: https://github.com/rust-lang/rust/issues/64797#issuecomment-706286311
And I think many more use cases don't need any combinators.
Also, one can support all
at least, by making the cfg(accessible)
feature desugar cfg(all(foo, bar, all(accessible(a), baz), accessible(b)))
to cfg_attr(all(foo, bar, baz), cfg_accessible(a, b))
(pending compiler support for cfg_accessible(a, b)
). foo, bar and baz could even contain any and all and such, as long as the criterion is upheld that any cfg(accessible)
is reachable from the root of the cfg tree by only visiting all
nodes (and no not
or any
ones). One could relax the criterion even further to "the clauses in the conjunctive normal form representation of the tree containing cfg(accessible)
are naked atoms". This would allow stuff like cfg(not(any(foo, not(accessible(a)))))
.
With all
supported, I wonder which use case would need any
or not
. Generally you do cfg(accessible)
right before using a feature you tested accessibility for, so if it evals to false then you would likely not want to use the feature.
not
would be needed if you were doing something like providing a polyfill when the feature is not available.
Yeah, not
seems extremely important to me, and hard to work around its absence. Lack of any
/all
could definitely be worked around, though, just it would be tedious.
A really amazing use case for this IMO is -sys crates For example:
libc
, currently i have code like this in a current project that does some unixey nonsense:
I'm not sure I agree that's a good use case for this feature. Using cfg(accessible) for that quickly results in hard-to-maintain code that changes behavior depending on whether a symbol exists, without any checks for typos or signatures. There would be all kind of very implicit contracts between crates, similar to what we'd have if we relied on duck-typing everywhere instead of well defined traits and types.
What would your suggested alternative be?
Currently, the only alternative I know of is to maintain a large list of cfgs for which targets support (e.g. getloadavg
or sysctlbyname
in the example), and I'm not sure I think that's that's any more maintainable.
Another thing I'm worried about this feature enabling or even encouraging code that depends on future versions of the standard library that don't even exist yet.
For example: If we're experimenting with an unstable type X
in the standard library, someone might reasonably write code like this to allow for a seamless transition for the future if/when std::X
gets stabilized:
#[cfg(accessible(std::X))]
pub type X = std::X;
#[cfg(not(accessible(std::X)))]
pub struct X { .. }
The problem with this 'future compatible' code is that it assumes the future stable std::X
type will be compatible with their own struct X
. If we decide to change the unstable API in some breaking ways before stabilization (which we regularly do), stabilization would break this code, even though they were not using nightly Rust or unstable #![feature]
s.
We could simply blame the author of this code for using cfg(accessible) like this, but if this is part of a popular library, all their users now have a reason to not upgrade to a new version of Rust, since we broke their dependency in the new Rust version.
That's a good point, and might be a point in favor of something like a #[cfg(version(...))]
instead, which seems much harder to abuse in that way (short of making a called shot about when the feature will land, which seems unlikely to happen frequently enough to matter).
That said, I believe this kind of version checking is regarded as undesirable for various reasons as well (although it's widely used in crates with long MSRV requirements).
What would your suggested alternative be?
Currently, the only alternative I know of is to maintain a large list of cfgs for which targets support (e.g.
getloadavg
orsysctlbyname
in the example), and I'm not sure I think that's that's any more maintainable.
Ideally, something that interacts well with our type and trait system, with where-clauses. E.g. an impl GetLoadAvg for System
, such that a function with a where System: GetLoadAverage
can call System::getloadavg()
. Then it can also be type checked properly on all systems, instead of only on those where the cfg()s are enabled. That'd also prevent a lot of the cases where we accidentally break tier 3 platforms.
I realize that this requires some new language features, and probably some more to make it anywhere close to ergonomic. I really dislike conditional compilation, as it results in similar surprises and headaches as e.g. C++ templates (which are only type checked when instantiated) or type issues in dynamically typed scripting languages. I think conditional compilation is unavoidable in some cases, but I don't think we should encourage more of it.
I'm not sure if I see cfg(accessible) as something to just improve existing #[cfg]
s, or as something that encourages even more (and more fragile) #[cfg]
s.
I also have noted in the past that cfg(accessible)
can seem like the right solution to a problem, but passing the wrong argument to accessible might end up with the code not running when expected. It's certainly a problem.
I also agree something like what I suggested probably wouldn't be the right fit for use inside libstd, but I don't think I take as hard of a line against conditional compilation. It does make code harder to maintain, but it also comes with a number of benefits (compile time, for example, even if just in terms of not needing to pull in things like windows-sys
or objc
when building for linux).
I am in favor of ways to avoid the need for it, though, especially inside the stdlib.
(I'm not sure I follow how your suggestion avoids the current situation, but am willing to accept that it may be slightly handwaved given the "this requires some new language features" bit)
This is a tracking issue for
#[cfg(accessible(::path::to::thing))]
(rust-lang/rfcs#2523).Steps
Status
From this comment
#[cfg_accessible(path)] item
. The attribute can configure or unconfigure theitem
and wait until the predicate "path
is accessible" becomes determinate.#[cfg(accessible)]
into#[cfg_accessible]
is not implemented, we need to consider doing or not doing it only when everything else is implemented.Unresolved questions:
None so far.