Open LukasKalbertodt opened 6 years ago
Hey, I created `proc-macro-crate` to find the name of a crate, even if it was renamed in `Cargo.toml`. This helps for procedural macros and the `extern crate` trick.
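To make the renaming problem concrete, here is a minimal sketch of the lookup such a helper has to perform. It is a simplified model, not the actual `proc-macro-crate` implementation or API: the `Found` and `Dep` types and the `find_crate` function are hypothetical, and a real tool would read these entries from the user's `Cargo.toml` instead of taking them as arguments.

```rust
/// How generated code should spell the path to a crate.
/// Hypothetical type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum Found {
    /// The macro is being expanded inside the wanted crate itself.
    Itself,
    /// The crate is a dependency, reachable under this (possibly renamed) name.
    Name(String),
}

/// One `[dependencies]` entry: the key written in Cargo.toml and the
/// optional `package = "..."` it points at when the dependency is renamed.
pub struct Dep<'a> {
    pub key: &'a str,
    pub package: Option<&'a str>,
}

/// Find the name under which `wanted` is visible from `current_package`.
pub fn find_crate(current_package: &str, deps: &[Dep<'_>], wanted: &str) -> Option<Found> {
    if current_package == wanted {
        return Some(Found::Itself);
    }
    deps.iter()
        // Without `package`, the key itself is the package name.
        .find(|dep| dep.package.unwrap_or(dep.key) == wanted)
        // Hyphens in a dependency key become underscores in Rust paths.
        .map(|dep| Found::Name(dep.key.replace('-', "_")))
}
```

With a dependency declared as `my_foo = { package = "foo" }`, the lookup for `foo` yields `Found::Name("my_foo")`, so the macro can emit `::my_foo::...` instead of a hard-coded `::foo::...`.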
It's also just annoying to have to make a separate crate for macros, especially given that the predominant style seems to be to reexport them so that the user doesn't need to fuss with managing the dependencies.
No matter what, I suspect this is blocked on proper def-site hygiene support. But in light of the annoyance of making a separate crate, I'll propose another alternative: allow proc macros to be defined in the same crates as regular items.
Obviously this isn't immediately workable, because of the possibility of circular dependencies. So we would need some kind of multi phase compilation. But I think, without much knowledge of the details, this might be viable without being too intrusive to the compiler. Specifically I propose the following:
- […] `macro` and `final`.
- […] `phase` attribute available to enable conditional compilation. It accepts a comma-separated list of phases and is inherited through scopes unless overridden, with a default of `#[phase(final)]`. But there can also be a `phase` key for `cfg` in order to cover use cases that the new attribute doesn't.
- […] `phase` attribute and can't have an explicit `phase` marking of their own. They always declare a macro into their immediate scope (in the macro namespace of course), but exact behaviour depends on phase.
  - In the `macro` phase, the declared macro is simply an error to call, and the definition is used to compile the macro. The placeholder macro exists to produce better error messages than "name doesn't exist" and to avoid accidentally invoking a macro that you didn't realize was imported from a glob import and that would have been shadowed. (Note: this can probably be done with minimal compiler support by having the `macro` phase version be a normal macro that just expands into a `compile_error!` invocation?)
  - In the `final` phase, the proc macro definition is ignored, and the declared macro name refers to the macro compiled during the `macro` phase, as by a `use`. The visibility of the macro name is determined by the visibility of the function defining it.
- […] `final` version of the crate, and therefore can refer to other names in the crate.
- The `proc_macro` crate provides a `quote_local_path!` macro, implemented via intrinsic, that accepts a local path and produces a hygienic, implicitly-delimited `TokenStream` referring to that path. Possibly the `quote!` macro could also have a syntax for this.
Another problem that arises from this issue that I think hasn't been mentioned here is when the proc-macro is re-exported.
If we have crate `a` which contains a proc-macro, and a crate `b` that depends on `a` and re-exports the macro (with `pub use a::my_macro`), then the code that depends on `b` will not have `::a` in scope, and this will result in hard-to-troubleshoot issues, since users of `b` don't even know about crate `a`.
Completely forgot about this issue. Basically there were two reasonably significant issues with the approach I wanted to take that would need to be addressed for any solution.
My personal view is that the former is trivially solvable: allow an optional version to be specified. The latter is quite a bit more difficult, and I have no solutions to propose.
@jhpratt
I think I don't understand where the problem is. If a proc macro could use `$crate`, it would refer to its accompanying library crate, which is unique.
Is it unique? As far as I understand, "accompanying library crate" is not a concept that really exists in rustc or even cargo. It's just a convention.
What if the proc macro is re-exported from a third crate? The problem is not as simple as you'd think. What if you want to reference some arbitrary crate? I have a proc macro that needs `::serde`, but that's not my crate.
Something I've long thought about is the ability to explicitly declare `runtime-dependencies` for a proc-macro. That would forward-declare a crate that the macro will generate code to reference (and so avoid dependency loops), and somehow give it some `TokenTree` through which it can refer to the crate in the generated code.
EDIT: It is possible to support multiple crates even if there is a single `$crate` though, you just need to have your runtime crate re-export all the other dependencies you need.
All that's needed is hygiene to ensure that crates are looked up in the context of the macro, not the call site.
On Mon., Sep. 5, 2022, 02:40 Jacob Pratt, @.***> wrote:
What if the proc macro is re-exported from a third crate? The problem is not as simple as you'd think. What if you want to reference some arbitrary crate? I have a proc macro that needs ::serde, but that's not my crate.
It's not just hygiene. The proc-macro is built for a different target, so it will have to do something similar to how a crate depending on a proc-macro crate implicitly creates a different kind of dependency edge. And having it just naïvely depend on the runtime crate creates dependency loops if the runtime crate then depends on the macro crate to re-export it. That's why I think it needs some way to do forward declaration of dependencies between cargo and rustc for these "dependencies, but not really".
Yeah, that's the hard part. The name lookup is the part that is not hard.
So, `runtime-dependencies` would create a pseudo-crate which depends on all the `runtime-dependencies` and make the `$crate` output by the macro refer to it? That sounds nice, although the pseudo-crate will still need special handling all the way down, since nothing can depend on it (otherwise there will be a cycle if it depends on something).
Another way is to move macro definition to the normal crate. A hacky idea that comes to mind is something like
reexport_macro_setting_dollar_crate_to_here! { the_crate_macros::the_macro }
i.e. add a built-in macro that creates a new macro setting `$crate` to the current crate. This also solves
To some extent this can be simulated even on stable, I think:
// main crate, that reexports the macro
macro_rules! the_macro {
    // `$crate` is expanded early it seems, and can be reinterpreted as a path
    // I've tested with `macro_rules!` and this still works if this macro is used from outside
    ($($args:tt)*) => { $crate::the_crate_macros::the_macro!($crate; $($args)*) }
}

// macro crate
#[proc_macro]
// named `the_macro` so that it matches the `the_crate_macros::the_macro!` call above
pub fn the_macro(ts: TokenStream) -> TokenStream {
    // pseudo-code: `parse_path` and `parse_semicolon` are not real `TokenStream` methods
    let krate = ts.parse_path();
    _ = ts.parse_semicolon();
    // ...
}
But, this is very limited -- for attribute and derive macros this won't work, as there is no syntax to define them in the normal crate.
A less hacky way would require defining macro sub-crates in-tree, which I think was discussed on Zulip. But that's a much bigger feature, I think.
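The forwarding step can be tried out without a proc-macro crate. In the sketch below a second `macro_rules!` macro stands in for the proc macro, which is an assumption made purely so the example fits in one file: `the_macro_impl` and the `runtime` module are hypothetical names, and the snippet assumes it sits at the crate root so that `$crate::runtime` resolves.

```rust
// Stands in for the runtime support code of the main crate.
pub mod runtime {
    pub fn answer() -> u32 {
        42
    }
}

// Stands in for the proc macro: it receives the crate token first,
// then `;`, then the user's own tokens, and builds paths from that token.
macro_rules! the_macro_impl {
    ($krate:tt; $name:ident) => {
        $krate::runtime::$name()
    };
}

// The wrapper users call: it forwards `$crate` ahead of the user's tokens.
macro_rules! the_macro {
    ($($args:tt)*) => {
        the_macro_impl!($crate; $($args)*)
    };
}

// Expands to a call of `runtime::answer()` through the forwarded token.
pub fn demo() -> u32 {
    the_macro!(answer)
}
```

The inner macro never names the defining crate itself. Every path it emits starts from the forwarded token, so the wrapper's hygiene decides which crate the path resolves in.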
To some extent this can be simulated even on stable, I think
I can confirm this works just fine for expression macros, I use it in `stylish` to get a re-exportable `format_args!` proc-macro. (It doesn't actually get expanded early, it gets passed in as a `$crate` ident which obeys its hygiene to determine which crate it refers to, you can get arbitrary crate access from one `$crate` token by changing its span).
Yes, this is a solution for proc macros, but it doesn't work for derive macros, does it?
But, this is very limited -- for attribute and derive macros this won't work, as there is no syntax to define them in the normal crate.
But rustc could do a similar(-ish) trick for them, I think. We "just" need to design and implement it.
https://github.com/rust-lang/rust/issues/54363#issuecomment-1236685409 is along the lines of what I would want in pretty much all of my macro libraries. Something like:
# Cargo.toml
[package]
name = "serde_derive"

[lib]
proc-macro = true

[dependencies]
proc-macro2 = "1"
quote = "1"
syn = "1"

[build-dependencies]
autocfg = "1"

[dev-dependencies]
serde = "1"
trybuild = "1"

# proposed section, not something Cargo supports today
[macro-dependencies]
serde = "1"

// src/lib.rs
use proc_macro::TokenStream;

#[proc_macro_derive(Serialize)]
pub fn derive_serialize(input: TokenStream) -> TokenStream {
    // `proc_macro::dependency` is the proposed API, it does not exist today
    let serde /*: proc_macro::Ident */ = proc_macro::dependency("serde");
    quote! {
        impl #serde::Serialize for …
    }
}
In terms of the Cargo build graph, this does not say `serde` needs to finish (or even start) building before `serde_derive` can start building, unlike ordinary `dependencies`. It says `serde` needs to finish building (the rmeta, not necessarily codegen) before anything that depends on `serde_derive` can begin building, except `serde` itself:
- If `serde` depends on `serde_derive` and calls this macro (it doesn't, but let's pretend) then the `Ident` that gets returned by `proc_macro::dependency("serde")` inside that expansion needs to behave just like a `$crate` that came from a `macro_rules` inside `serde` would behave.
- If some other crate depends on `serde_derive` (directly or transitively) and calls this macro, the `Ident` is as though the downstream crate had its own direct dependency on `__unnameable = { package = "serde", version = "1" }` and obtained a `$crate` from it.
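The ordering rule described above can be modelled as a small function over a toy build graph. This is only a sketch of the proposal, not how Cargo schedules builds: the `Manifest` type and `must_build_before` are hypothetical, and the model only follows direct dependencies, whereas the proposal also covers crates that reach `serde_derive` transitively.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A toy manifest: ordinary dependencies plus the proposed macro-dependencies.
#[derive(Default)]
pub struct Manifest {
    pub dependencies: Vec<&'static str>,
    pub macro_dependencies: Vec<&'static str>,
}

/// Crates that must be built (at least to rmeta) before `name` can start.
pub fn must_build_before(
    graph: &BTreeMap<&'static str, Manifest>,
    name: &str,
) -> BTreeSet<&'static str> {
    let mut out = BTreeSet::new();
    let manifest = match graph.get(name) {
        Some(manifest) => manifest,
        None => return out,
    };
    for &dep in &manifest.dependencies {
        // Ordinary edge: the dependency itself comes first.
        out.insert(dep);
        // Proposed edge: the macro-dependencies of `dep` also come first,
        // except when one of them is the crate being built.
        if let Some(dep_manifest) = graph.get(dep) {
            for &macro_dep in &dep_manifest.macro_dependencies {
                if macro_dep != name {
                    out.insert(macro_dep);
                }
            }
        }
    }
    out
}
```

In this model `serde_derive` does not wait for `serde`, `serde` waits only for `serde_derive`, and a downstream crate that depends on `serde_derive` waits for both.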
The discussion above about "what if multiple versions" and "what if different registries" doesn't seem applicable to this solution. The `macro-dependencies` entry describes a particular version just as a `dependency` or `dev-dependency` would do, and implicitly or explicitly a registry, and integrates nicely with Cargo `patch`. For example `[patch.crates-io] serde = { path = "…" }` would apply to that `macro-dependency` exactly as it would apply to an ordinary `dependency`.
@Nemo157 @dtolnay Love it. Both questions/problems I stated are inherently resolved by using `Cargo.toml`, which is something I honestly never considered.
Based on this discussion over def_site in proc macros. https://github.com/rust-lang/rust/issues/54724#issuecomment-867953306
Would it be appropriate for @dtolnay's macro-dependencies suggestion to put the dependencies into the def_site namespace?
Note that the same problem occurs in build script dependencies: `prost`/`prost-build`, `tonic`/`tonic-build`, `configure_me`/`configure_me_codegen`... I think both need to be solved and probably doing it the same way is the simplest option.
The problem
In macros-by-example we have `$crate` to refer to the crate the macro is defined in. This is very useful as the library author doesn't have to assume anything about how that crate is used in the user's crate (in particular, the user can rename the crate without breaking the world).
In the new proc macro system we don't seem to have this ability. It's important to note that just `$crate` won't be useful most of the time though, because right now most crates using proc macros are structured like that:
- `foo-{macros/derive/codegen}`: this crate is `proc-macro = true` and defines the actual proc macro.
- `foo`: defines all runtime dependency stuff, has `foo-{macros/derive/codegen}` as dependency and reexports the proc macro.
An example:
So an equivalent of `$crate` would refer to the `foo-{macros/derive/codegen}` crate which is not all that useful, because we mostly want to refer to `foo`. The best way to solve this right now is to use absolute paths everywhere and hope that the user doesn't rename the crate `foo` to something else.
The proc macro needs to be defined in a separate crate and the main crate `foo` wants to reexport the macro. That means that `foo-macros` doesn't know anything about `foo` and thus blindly emits code (tokens) hoping that the crate `foo` is in scope.
But this doesn't sound like a very robust solution.
Furthermore, using the macro in `foo` itself (usually for testing) is not trivial. The macro assumes `foo` is an extern crate that can be referred to with `::foo`. But that's not the case for `foo` itself. In one of my codebases I used a hacky solution: when the first token of the macro invocation is `*`, I emit paths starting with `crate::` instead of `::foo::`. But again, a better solution would be really appreciated.
How can we do better?
I'm really not sure, but I hope we can use this issue as place for discussion (I hope I didn't miss any previous discussion on IRLO).
However, I have one idea: declaring dependencies of emitted code. One could add another kind of dependencies (apart from `dependencies`, `dev-dependencies` and `build-dependencies`) that defines what crates the emitted code depends on. (Let's call them `emit-dependencies` for now, although that name should probably be changed.) So those dependencies wouldn't be checked/downloaded/compiled when the proc macro crate is compiled, but the compiler could make sure that those dependencies are present in the crate using the proc macro.
I guess defining those dependencies globally per crate is not sufficient since different proc macros could emit code with different dependencies. So maybe we could define the `emit-dependencies` per proc macro. But I'm not sure if that makes the check too complicated (because then Cargo would have to check which proc macros the user actually uses to collect a set of `emit-dependencies`).
That's just one idea I wanted to throw out there.
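As a rough illustration of the per-macro check, the sketch below collects the `emit-dependencies` of the macros a crate actually uses and reports the ones missing from that crate's own dependency list. It is a toy model under stated assumptions: `missing_emit_dependencies` and its inputs are hypothetical, version requirements are ignored, and neither Cargo nor rustc exposes anything like this today.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Report emit-dependencies that the using crate does not declare.
///
/// `emit_deps_per_macro` maps a proc macro name to the crates its output
/// refers to; `used_macros` lists the macros the user crate invokes.
pub fn missing_emit_dependencies(
    emit_deps_per_macro: &BTreeMap<&str, Vec<&str>>,
    used_macros: &[&str],
    user_deps: &[&str],
) -> BTreeSet<String> {
    let present: BTreeSet<&str> = user_deps.iter().copied().collect();
    used_macros
        .iter()
        // Only macros that are actually used contribute requirements.
        .filter_map(|name| emit_deps_per_macro.get(*name))
        .flatten()
        .copied()
        .filter(|dep| !present.contains(*dep))
        .map(|dep| dep.to_string())
        .collect()
}
```

A crate that only uses a macro emitting `serde` paths would pass with `serde` alone, while using a second macro that also emits `serde_json` paths would report `serde_json` as missing.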
Related
`serde` issue related to this