rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.39k stars 12.59k forks source link

Tracking issue for `proc_macro::Span` inspection APIs #54725

Open alexcrichton opened 6 years ago

alexcrichton commented 6 years ago

This issue is intended to track a number of unstable APIs which are used to inspect the contents of a Span for information like the file name, byte position, manufacturing new spans, combining them, etc.

This issue tracks the proc_macro_span unstable feature.

Public API

Already stabilized:

impl Span {
    pub fn source_text(&self) -> Option<String>;
}

impl Group {
    pub fn span_open(&self) -> Span;
    pub fn span_close(&self) -> Span;
}

To be stabilized, probably in their current form:

impl Span {
    pub fn line(&self) -> usize;
    pub fn column(&self) -> usize;

    pub fn start(&self) -> Span;
    pub fn end(&self) -> Span;
}

To be stabilized after some (re)design or discussion:

impl Span {
    pub fn source_file(&self) -> SourceFile;

    pub fn byte_range(&self) -> Range<usize>;
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SourceFile { .. }

impl !Send for SourceFile {}
impl !Sync for SourceFile {}

impl SourceFile {
    pub fn path(&self) -> PathBuf;
    pub fn is_real(&self) -> bool;
}

Things that require more discussion:

impl Span {
    pub fn eq(&self, other: &Span) -> bool;
    pub fn join(&self, other: Span) -> Option<Span>;
    pub fn parent(&self) -> Option<Span>;
    pub fn source(&self) -> Span;
}

impl Literal {
    pub fn subspan<R: RangeBounds<usize>>(&self, range: R) -> Option<Span>;
}
softprops commented 5 years ago

I wanted to shed some pain this is causing with tarpaulin.

Tarpaulin has worked amazingly well for me and my company as a replacement for kcov which historically been a pain to get accurate, reliable and correct results. At the moment tarpaulin stands as the most promising goto option for codecoverage in rust and just feels more like the only first class tooling option.

Having one of those when choosing to invest in a technology is important for many company's adoption story for checking off code quantity checkmarks. When they see that rust doesn't have a reasonable code quality story that works on stable rust, that's can result in a "pass" rather than "I'll take it". There are currently some work arounds for making this work-ish on stable but it feels very much like the story serde was in a year or so ago when you wanted to show all your friends how amazing serde was but then was embarrassed to show what it took to make work on stable because of a macro stabilization blocker.

JakeTherston commented 5 years ago

With procedural macros having reached a point where they're very useful on stable, I expect many users will find themselves needing access to this information. Would it be reasonable to only stabilize parts of the Span API that are not too risky? Perhaps exposing a function that optionally returns the path of the file where a macro is invoked if such a file exists?

matklad commented 5 years ago

I have a concern about exposing LineColum information. It looks like it could play badly with incremental compilation, especially in the IDE context.

My understanding is that, if one adds a blank line to the start of the file, the line_column information of the input spans to proc macro changes. That means that IDE would have to re-expand procedural macros even after insignificant white space changes.

I would feel more confident if proc-macros were strictly a pure function from the input token stream to output token stream. This can be achieved, for example, by making line-column infocation relative to the start of macro invocation (as opposed to relative to the start of the file).

I don't feel that exposing absolute position is necessary the end of the world: IDE can track if a macro actually makes use of the info, or it can supply macro with some fake line/columns. But it's hard to tell if such hacks would work well in practice, and it's hard to experiment with them for the lack of IDE which handled proc-macros incrementally....

davidlattimore commented 5 years ago

If the parser allocated IDs to every AST node, then, and this is the hard part, when an edit was made to the source, the parser tried to keep those IDs the same in all the non-edited code and only allocate new IDs for new nodes, that would allow spans to be kept completely separate from the AST. Those IDs could be passed through macro expansion without causing unnecessary invalidations. If something needed a span later on, it could then go back and ask the parser for the span for that particular AST node ID. I feel like having an incremental parser is important, not because parsing is the bottleneck, but because it underpins everything else.

matklad commented 5 years ago

@davidlattimore this is fascinating, but slightly off-topic for the issue. I've created a thread on internals: https://internals.rust-lang.org/t/macros-vs-incremental-parsing/9323

est31 commented 5 years ago

The column!() macro as well as std::panic::Location::column are returning 1-based columns while the span available from the proc-macro crate is 0-based according to its docs. Is this inconsistency intended?

est31 commented 5 years ago

This thread has more discussion about 1-based columns: https://github.com/rust-lang/rust/pull/46762#issuecomment-352474639

est31 commented 5 years ago

Another open question is how this API relates to https://github.com/rust-lang/rust/issues/47389 which is about minimizing span information throughout the compiler. Should stabilization be blocked until a design for #47389 is found? Is it too late already as we have set_span functionality? @michaelwoerister what do you think?

michaelwoerister commented 5 years ago

47389 is mostly concerned about data types that are used later in the compilation pipeline, such as type information and MIR. Exposing things at the proc-macro level should not be too bad.

est31 commented 5 years ago

But rust-analyzer might one day expand the scope of incremental compilation to the parsing stage, right?

matklad commented 5 years ago

That is already true for rust-analyzer, macro expansion stage runs incrementally. We do’t yet support line/column macro, but I think what we would do is that we’ll just expand them to a dummy value like 42. This probably won’t break macro (It could be this line indeed), but will allow us to avoid re-expanding macro after every keystroke

est31 commented 5 years ago

What we could do is store for each macro invocation whether it has invoked any of the inspection APIs or not and then re-run the macro only if any of the span information it has requested has had any changes. Basically only adding the infos the requested spans to the hash (or only adding span info to the hash if there's been any request of span info). However, as this has to update state, it would make span inspection slower than it could be, especially if something inspects spans routinely. So not sure about the performance impacts of this.

jhpratt commented 5 years ago

@alexcrichton @matklad Any update on what blockers are for this issue? It doesn't seem like there should be anything holding this back.

est31 commented 5 years ago

@jhpratt my question about 1-based vs 0-based is still not answered: https://github.com/rust-lang/rust/issues/54725#issuecomment-497246753

jhpratt commented 5 years ago

Hm, that's a fairly simple one. My intuition would certainly lean towards 1-based.

syntacticsugarglider commented 4 years ago

Is this still blocked on questions with regard to the inconsistency between two base indices?

Alxandr commented 4 years ago

I just thought I'd chime in here, cause I just ran into an issue where I would like to be able to map "place in code" to something else. Basically, I don't care about files or columns or indexes or anything, I just need something that's Eq + Hash, and that I can get 2 from (start and end) for a Span (also that it follows that any given token t has the same end as the start of the next token). This could just be a usize that's local to the current macro invocation and doesn't have to mean anything to the broader world, but it gives the ability to sort and map on "code points".

de-vri-es commented 4 years ago

I'd love to see this stabilized too. In particular, assert2 uses Span::source_text() to show the expression that caused an assertion failure:

On stable it falls back to regular stringification, but that certainly looks less pretty.

Aaron1011 commented 4 years ago

Note that Span::parent allows you to write 'completely unhygenic' macros, which can see identifiers in a way that would otherwise be impossible:

// src/lib.rs
#![feature(proc_macro_span)]

use proc_macro::TokenStream;

#[proc_macro]
pub fn no_hygiene(tokens: TokenStream) -> TokenStream {
    tokens.into_iter().map(|mut tree| {
        let mut span = tree.span();
        while let Some(parent) = span.parent() {
            span = parent;
        }
        tree.set_span(tree.span().resolved_at(span));
        tree
    }).collect()
}
use no_hygiene::no_hygiene;

macro_rules! weird {
    () => {
        println!("{}", no_hygiene!(my_ident));
    }
}

fn main() {
    let my_ident = "Hello";
    weird!();
}

Here, weird!() is able to access my_ident via no_hygiene!. This is already possible with just Span::resolved_at in some circumstances - if a single token has a Span with the desired hygiene, a proc-macro can construct arbitrary tokens with the desired hygiene. However, Span::parent allows hygiene to be completely bypassed - a proc-macro can see into a particular scope that it would otherwise be completely isolated from.

Using Span::parent may be useful for the purposes of emitting error messages, but I'm worried about macros generating working code that intentionally uses this. This seems like an easy way to end up with code that has confusing non-local effects.

Aaron1011 commented 4 years ago

We might want to gate 'read-only' APIs (e.g. Span::source_text, Span::start) under a separate feature, since they could be stabilized without increasing the power of proc macros w.r.t hygiene.

udoprog commented 4 years ago

We might want to gate 'read-only' APIs (e.g. Span::source_text, Span::start) under a separate feature, since they could be stabilized without increasing the power of proc macros w.r.t hygiene.

Came here to advocate for just this. I have a use case where I require access to the start/end line/column information of spans. And it would indeed be nice if they could be decoupled from the wider feature for an easier path towards stabilization.

udoprog commented 4 years ago

So I'm looking into what's needed to get the following stabilized in libproc_macro:

There are two outlined concerns that I've been able to digest so far. If someone has more, I'd love to hear them.

1-based columns

The corresponding functions and LineColumn structure in proc_macro2 are already marked as stable with the same 0-column specification. But as pointed out in alexcrichton/proc-macro2#237, while they are available they only supply dummy values on stable right now. But if desired, I'd be happy to put in the work to modify the reported column information to be 1-based instead. I personally don't have a strong opinion either way. The manner in which I've used spans in genco (and why I now have a desire for them to be stabilized :)) only cares about relative spans.

reported spans affecting the output of proc-macros

Given that proc-macros today can execute arbitrary code, and this seems like something we just have to live with. I don't know how beneficial limiting the existing proc_macro API surface with the goal of making macro execution as reproducible as possible is at this point. I personally wouldn't want to block on this. But for my specific use-case, changing the spans to be reported in a manner which is local to the macros themselves would also give me what I need. So if desired, I'd be for implementing this change.

To clarify and IIUC, that means this assuming 0-based columns:

/// A comment
macro! { hello // start {column: 8, line: 2}
    world // start {column: 4, line: 3}
};

Would instead be reported in the proc-macro as this:

/// A comment
macro! { hello // start {column: 1, line: 1}
    world // start {column: 4, line: 2}
};
matklad commented 4 years ago

reported spans affecting the output of proc-macros I don't know how beneficial limiting the existing proc_macro API surface with the goal of making macro execution as reproducible as possible is at this point.

I don't think there are any concerns about reproducibility, there are concerns about incremental compilation which is something completely different. I think its important that we get opinion of some incremental compilation expert from t-compiler about this (cc @wesleywiser? not really sure whom to CC).

Are we ok with exposing absolute position information to macros? AFAIUI, that means that either:

udoprog commented 4 years ago

I don't think there are any concerns about reproducibility, there are concerns about incremental compilation which is something completely different.

So when I phrase it as the output of a macro being reproducible, that is what I'm referring to. Given a collection of inputs it would produce an identical output which would allow it to be cached during incremental compilation.

But to my wider point, a proc-macro can generate random numbers, access the filesystem, or connect to a database (the example I linked to). I'd be very curious to hear about plans for incremental compilation, but given these facts I have a hard time envisioning an approach that allows us to make decisions about re-expanding beyond some form of well understood best-effort heuristic.

est31 commented 4 years ago

But to my wider point, a proc-macro can generate random numbers, access the filesystem, or connect to a database (the example I linked to).

That's possible for now, but it's not a documented feature that proc macros are unsandboxed and it's likely that wasm based macros won't get that capability.

To the general question of incremental compilation see also https://github.com/rust-lang/rust/issues/54725#issuecomment-502932263

wesleywiser commented 4 years ago

@matklad While I do know a bit about the incremental system, I wouldn't consider myself an expert. I believe @pnkfelix knows a fair bit more than I.

CreepySkeleton commented 4 years ago

I'm specifically interested in Span::join that would allow for painless proc-macro error messages on stable. There's the hack that makes it already possible, but in order to exploit it, one has to carry around two spans: the span to the first token and the span to the last token.

If join were available, syn could do the joining in its ast.span() methods and the two-spans trick would become easier to pull off since the joined span had already covered the whole range.

Aaron1011 commented 4 years ago

I'm specifically interested in Span::join that would allow for painless proc-macro error messages on stable

Before Span::join is stabilized, we should decide how it should handle hygiene. If I write first.join(second) should the returned Span use Span::resolved_at(first) or Span::resolved_at(second)? Or should we change the return type to Result<Span, JoinErr>, and have enum JoinError { DifferentFile, DifferentHygiene }?

SergioBenitez commented 4 years ago

@Aaron1011

Note that Span::parent allows you to write 'completely unhygenic' macros, which can see identifiers in a way that would otherwise be impossible:

Not to advocate in either direction, but this is already possible today, without Span::parent():

// main.rs
mod submod {
    macro_rules! make_local_weird {
        ($local:item) => (
            #[macro_export]
            macro_rules! weird {
                () => (
                    println!("{}", export! { $local private })
                )
            }

            pub use weird;
        )
    }

    make_local_weird!(fn f() {});
}

pub fn main() {
    let private = "hi";
    submod::weird!();
}
// lib.rs
use proc_macro::TokenStream;

use syn::spanned::Spanned;
use syn::{parse_macro_input, Result};
use syn::parse::{Parse, ParseStream};
use quote::quote;

struct Export {
    item: syn::Item,
    ident: syn::Ident,
}

impl Parse for Export {
    fn parse(input: ParseStream) -> Result<Self> {
        Ok(Export {
            item: input.parse()?,
            ident: input.parse()?,
        })
    }
}

#[proc_macro]
pub fn export(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as Export);
    let mut ident = input.ident;
    let span = ident.span().resolved_at(input.item.span());
    ident.set_span(span.into());
    quote!(#ident).into()
}
jonas-schievink commented 3 years ago

Are we ok with exposing absolute position information to macros? AFAIUI, that means that either:

  • we'll have to re-expand all proc-macros even for trivial changes like addition of blank line.

  • we'll have to record "has this macro looked at line/column?" bit of info, and still re-expand those specific macros for trivial changes.

A third option would be to provide dummy information to proc macros from rust-analyzer. AFAIK, most proc macros use the data they obtain from these APIs only at runtime, and do not really make any decisions based on it at expansion time (and even then, they should work fine if line/column is always 1, unless they're being actively annoying).

udoprog commented 3 years ago

A third option would be to provide dummy information to proc macros from rust-analyzer. AFAIK, most proc macros use the data they obtain from these APIs only at runtime, and do not really make any decisions based on it at expansion time (and even then, they should work fine if line/column is always 1, unless they're being actively annoying).

The macros requiring this information appear to do so for whitespace information (mine does, and so does done of the linked issues above as well). I don't see how this would work without a difference between spans. So being actively annoying might just be a valid use case.

aticu commented 3 years ago

Just to give another example of proc-macros requiring white-space information to function properly: inline-python

Also another use-case I could think of would be generating documentation based on Rust code in the macro input. Without whitespace information, this could look quite bad:

std :: mem :: ManuallyDrop < T > :: drop ( & t ) ;
// vs
std::mem::ManuallyDrop<T>::drop(&t);
call ( & x [ 0 ] ) ;
// vs
call(&x[0]);

Though the second use-case could probably handle invalid line/column data fairly well, if it is only given when executed by an IDE. In the worst case it would possibly just generate unreadable documentation that isn't used by the IDE.

jan-hudec commented 3 years ago

Since these examples only need relative position of the tokens, the fourth option is to provide the information, but count lines from the start of the macro call only (so the first token in the token stream always has line = 1). That way if whitespace changes inside the macro call it needs to be re-expanded, but if anything changes before it it does not.

m-ou-se commented 3 years ago

A third option would be to provide dummy information to proc macros from rust-analyzer. AFAIK, most proc macros use the data they obtain from these APIs only at runtime, and do not really make any decisions based on it at expansion time

ct_python!{} executes Python code at compile time. Changing the span information could change the meaning of the Python code. Though one could definitely argue that this whole crate is a horrible idea in combination with RLS/rust-analyzer (or just in general ^^'). (With RLS, it executes the Python code while you are typing it. :grimacing:)

Maybe it'd be useful if a proc macro can tell it is being executed by a language server instead of the compiler? Not sure what it could do better in that case though, because the language server still needs the expansion to do the rest of its job.

jonas-schievink commented 3 years ago

We had to rely on Span's Debug impl because the line/column access methods are still unstable: https://github.com/knurling-rs/defmt/issues/344

udoprog commented 3 years ago

So tentatively it seems like the proposed blockers for stabilizing a subset of proc_macro_span as I detailed in https://github.com/rust-lang/rust/issues/54725#issuecomment-647140141 (i.e. min_proc_macro_span) have either been discussed or no one has raised much interest in them (e.g. if spans should be reported macro-relative or be 1 or 0-based).

The biggest discussion seems to be about incremental compilation, so I'd like to take as broad a view on the topic as I can here. I hope I'm not missing something important, but if I do, please do tell!

Incremental compilation of macros as they look today based on assumptions of reproducibility (https://github.com/rust-lang/rust/issues/54725#issuecomment-502932263, https://github.com/rust-lang/rust/issues/54725#issuecomment-647142018) do not seem like they're going to be affected either positively or negatively by stabilization of proc-macro APIs which allows for different outputs depending on where the macro is located in source. As mentioned a few times above (https://github.com/rust-lang/rust/issues/54725#issuecomment-647146117, https://github.com/rust-lang/rust/issues/54725#issuecomment-736371755) the stable parts of the proc-macro system is already "contaminated" with other sources of data that can cause different outputs. This is even the case for the currently stable API of libproc_macro since the output of fmt::Debug prints the absolute byte span of a span in the file.

So it seems like the cat is already out of the bag here. And this might very well be addressed in future proc-macro APIs. If such an API is designed in the future it could also come with a different version of libproc_macro. But "leaky macros" do regardless seem to have legitimate uses, so it seems unlikely that they would ever be fully deprecated.

So I would like to push towards moving forward with stabilization with a minimal version of proc_macro_span which features read-only APIs. In particular parts which have no effect on hygiene (as mentioned here https://github.com/rust-lang/rust/issues/54725#issuecomment-640169091).

afetisov commented 3 years ago

Having absolute line&column positions will also break autoformatting. While sensible formatting of macros is mostly impossible, since they don't need to follow the Rust syntax, at the very least we can currently assume that whitespace has no semantic meaning. This means that we can add whitespace around operators, format inner blocks if we can prove that they are valid Rust code, or make macro calls properly indented. If a macro knows its absolute position, then simply adding or removing indentation is enough to arbitrarily change its behaviour.

Granted, it's a very minor point compared to a loss of hygiene or incremental compilation, but it's still an issue.

Any kind of tooling which assumes that whitespace has no meaning will be screwed by this feature. Note that the promise of no meaningful whitespace is currently contained in the reference manual. Quote:

A Rust program has identical meaning if each whitespace element is replaced with any other legal whitespace element, such as a single space character.

And according to the reference on macros, the macros receive only a delimited token tree, so the whitespace is also irrelevant within macros.

So stabilizing this API will be strictly a backwards incompatible change w.r.t. current Rust guarantees. While, strictly speaking, the usual guarantees are with respect to code compiling and having the same behaviour, I believe that breaking compatibility with tools isn't a thing which should be taken lightly.

droundy commented 3 years ago

I'll like to point out that procmacros can already parse the source code to find where they might have been invoked, and can do any number of bad things that will cause changes to users code to break under various circumstances. Making it more convenient for them to break tooling does not itself break said tooling. Any user of a proc macro is already relying on the author of that macro to make it do reasonable things.

Getting reasonable debug information in cases where macros fail is a huge win, and I'd hate got that to be lost due to some concern that macro writers will turn evil if we give them more power.

afetisov commented 3 years ago

You're stating it as if it's a hypothetical problem, while there are already crates which do exactly the position-dependent processing that I'm talking about. Do remember that this can cause issues for downstream users which may not even know about such features, much less use them directly. I shouldn't know which proc macros are called by my transitive dependencies, but if a macro that I use calls one then my IDE performance will suddenly plummet.

If it's only about debug information, then surely there are better ways to provide it than directly exposing source code layout. Position-independent relative offsets solve many of the problems: if I'm forbidden to offset outside of the containing macro call, then I probably can't violate hygiene, and one doesn't need to reparse the whole crate if some formatting changes. Another option is to make line&column information into opaque objects rather than just integers. You can print them, you can store them, maybe even do some simple calculation like comparison and calculating relative offset, but you can't just make one out of thin air or extract forbidden absolute positions.

Come to think of it, why would it ever be a good idea to allow proc macros to know absolute offset and to create arbitrary spans? Surely a macro has no legal way to know the contents of the surrounding code, so why would it be granted access to it in the first place? The only thing that a user of the macro sees is the code that they directly typed inside of it, and spans in that exact code should be the only thing that a macro should work with.

A different, always available option is to write a separate source-transforming tools, rather than make those capabilities always provided by the compiler. You can use a format which is related to, but different from the Rust sources. You can use the standard syntactic libraries for writing that tool, and since you're doing all the lexing and parsing, you're free to keep any extra information that you want. For a practical example you can look at the camlp5 preprocessor for the OCaml language. But at the same time no one will accidentally depend on your special syntactic features, and the other tools will know that your format is something that they can't process without special support. I don't think pulling the kitchen sink, the fridge and the whole swimming pool into the language is necessarily a great idea.

jonas-schievink commented 3 years ago

I shouldn't know which proc macros are called by my transitive dependencies, but if a macro that I use calls one then my IDE performance will suddenly plummet.

IDE performance would only be minimally affected if these Span APIs are implemented properly. You can already make IDE performance arbitrarily bad by just using a slow proc macro, so I don't really think this is a reason to not provide this API at all.

if I'm forbidden to offset outside of the containing macro call, then I probably can't violate hygiene

Hygiene is independent from positional information. We could remove the problematic Span::parent and stabilize the LineColumn-related APIs.

wycats commented 2 years ago

Hygiene is independent from positional information. We could remove the problematic Span::parent and stabilize the LineColumn-related APIs.

This sounds great and would make it possible to reconstruct enough information about whitespace to implement a lot of whitespace-sensitive EDSLs. Any blockers?

dvc94ch commented 2 years ago

Hygiene is independent from positional information. We could remove the problematic Span::parent and stabilize the LineColumn-related APIs.

This sounds good to me. Looks like the reason https://github.com/rust-lang/rust/pull/84331/files was rejected was introducing a new feature flag. I believe if instead those api's had just been made stable, it might have been fine.

cdmistman commented 2 years ago

Since the information is there, I'm wondering if it's possible to expose the byte offsets of a span within a file.

For example, given this (very simple and dumb) proc macro (using syn and quote for convenience)

use quote::quote;
use quote::ToTokens;
use syn::parse_macro_input;
use syn::spanned::Spanned;
use syn::ItemFn;
use syn::Stmt;

#[proc_macro_attribute]
pub fn instrument(
    _attrs: proc_macro::TokenStream,
    input: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
    let mut input = parse_macro_input!(input as ItemFn);
    let span = input.span();
    let span = format!("{:?}", input.span());
    input
        .block
        .stmts
        .push(syn::parse2::<Stmt>(quote! { println!("span is {}", stringify!(#span)); }).unwrap());
    input.to_token_stream().into()
}

on the following input:

#[proc_macro::instrument]
fn main() {}

the output of the resulting binary is:

span is "#0 bytes(27..39)"

However, those byte offsets aren't accessible in the API. Is it possible to expose those byte offsets via getters? Maybe some fn byte_offsets(&self) -> Range<usize>?

nikomatsakis commented 2 years ago

Hi all --

We read over this in @rust-lang/lang backlog bonanza today and had a few questions. There seems to be a broad set of APIs covered under this feature, so I think it would be useful to identify what subset we need to enable particular use cases.

Reviewing the thread, the major use case seemed to be reporting diagnostics (e.g., this comment about assert2). If that's the case, it seems like exposing

would be sufficient to cover these cases. All of these have the property that one must actually edit the contents of the macro to invalidate them -- though source-text is distinct from 'pure tokens'.

Are there use cases that would be not be handled by the APIs above? (And which ones require relative information?)

If not, maybe we should break out this subset and discuss there?

dannymcgee commented 2 years ago

Hi all --

We read over this in @rust-lang/lang backlog bonanza today and had a few questions. There seems to be a broad set of APIs covered under this feature, so I think it would be useful to identify what subset we need to enable particular use cases.

Reviewing the thread, the major use case seemed to be reporting diagnostics (e.g., this comment about assert2). If that's the case, it seems like exposing

  • source_text
  • nicer diagnostic APIs (separate tracking issue, I think)
  • and maybe line/column information relative to the start of the macro

would be sufficient to cover these cases. All of these have the property that one must actually edit the contents of the macro to invalidate them -- though source-text is distinct from 'pure tokens'.

Are there use cases that would be not be handled by the APIs above? (And which ones require relative information?)

If not, maybe we should break out this subset and discuss there?

I had a use case a while back that really would have benefitted from knowing the path to the source file where the macro invocation occurred. I was trying to write something like a custom module resolver — basically, I wanted to be able to "import" a source file written in some other language (like a scripting or templating language), and the macro would facilitate transpiling the import to valid Rust code. But being unable to specify relative import paths made it prohibitively inconvenient/unergonomic for my liking.

However, that use case would also require invalidating the macro whenever the contents of the imported file changed, so it sounds like it would be problematic for plenty of other reasons besides.

john01dav commented 2 years ago

Hi all -- We read over this in @rust-lang/lang backlog bonanza today and had a few questions. There seems to be a broad set of APIs covered under this feature, so I think it would be useful to identify what subset we need to enable particular use cases. Reviewing the thread, the major use case seemed to be reporting diagnostics (e.g., this comment about assert2). If that's the case, it seems like exposing

  • source_text
  • nicer diagnostic APIs (separate tracking issue, I think)
  • and maybe line/column information relative to the start of the macro

would be sufficient to cover these cases. All of these have the property that one must actually edit the contents of the macro to invalidate them -- though source-text is distinct from 'pure tokens'. Are there use cases that would be not be handled by the APIs above? (And which ones require relative information?) If not, maybe we should break out this subset and discuss there?

I had a use case a while back that really would have benefitted from knowing the path to the source file where the macro invocation occurred. I was trying to write something like a custom module resolver — basically, I wanted to be able to "import" a source file written in some other language (like a scripting or templating language), and the macro would facilitate transpiling the import to valid Rust code. But being unable to specify relative import paths made it prohibitively inconvenient/unergonomic for my liking.

Seconded. I was going to write a similar reply myself -- I had a use case that was almost exactly the same, where I wanted to allow something resembling including a module that's written in another language. This isn't the only use case for a feature like this though. Anything that includes things from the file system can benefit, such as include_dir.

Nemo157 commented 2 years ago

See https://github.com/rust-lang/rfcs/pull/3200 for some more APIs for including files (with some minor discussion about "Span-relative" includes)

450220020 commented 2 years ago

stable version required

madsmtm commented 2 years ago

Another use-case noted above that is quite different from diagnostics is getting a unique identifier to use in link_section/export_names.

I've opened an internals thread for discussing that.

daxpedda commented 2 years ago

@Aaron1011 closed by mistake? Can't seem to find an alternative tracking issue, a PR stabilizing it or a PR removing the feature.