rust-lang / rustfmt

Format Rust code
https://rust-lang.github.io/rustfmt/
Apache License 2.0

format macros #8

Open nrc opened 9 years ago

nrc commented 9 years ago

This will be interesting...

nrc commented 9 years ago

The strategy here will be to get the text for the macro, replace every $foo in the body of the macro with a string of the same length, e.g., xfoo. Then re-parse it and format it and convert the xfoos back to $foos (obvs we need to check that there is no xfoo in the body before substituting).

We'll need to format the decl of the macro, and find the body, I suppose, using token trees.
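
A minimal sketch of the substitution step described above, not rustfmt's actual implementation. The helper names (`map_words`, `substitute_metavars`, `restore_metavars`) are hypothetical, and the scanner is deliberately naive: it does not skip string literals or comments. The re-parse and format step that would sit between the two calls is left out.

```rust
use std::collections::HashSet;

fn is_ident_start(c: char) -> bool {
    c == '_' || c.is_ascii_alphabetic()
}

fn is_ident_continue(c: char) -> bool {
    c == '_' || c.is_ascii_alphanumeric()
}

/// Walk `text` and pass every identifier-like word to `f`, together with a flag
/// saying whether it was written as `$word`. `f` returns the replacement text.
fn map_words(text: &str, mut f: impl FnMut(&str, bool) -> String) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let dollar = chars[i] == '$' && i + 1 < chars.len() && is_ident_start(chars[i + 1]);
        let start = if dollar { i + 1 } else { i };
        if is_ident_start(chars[start]) {
            let mut end = start;
            while end < chars.len() && is_ident_continue(chars[end]) {
                end += 1;
            }
            let word: String = chars[start..end].iter().collect();
            out.push_str(&f(&word, dollar));
            i = end;
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}

/// Replace each `$foo` with the same-length placeholder `xfoo`.
/// Returns `None` if a placeholder collides with an identifier already in the body.
fn substitute_metavars(body: &str) -> Option<(String, Vec<String>)> {
    let mut plain = HashSet::new();
    let mut placeholders: Vec<String> = Vec::new();
    let substituted = map_words(body, |word, dollar| {
        if dollar {
            let placeholder = format!("x{}", word);
            if !placeholders.contains(&placeholder) {
                placeholders.push(placeholder.clone());
            }
            placeholder
        } else {
            plain.insert(word.to_string());
            word.to_string()
        }
    });
    if placeholders.iter().any(|p| plain.contains(p)) {
        return None;
    }
    Some((substituted, placeholders))
}

/// Turn the placeholders back into `$foo` after formatting.
fn restore_metavars(formatted: &str, placeholders: &[String]) -> String {
    map_words(formatted, |word, dollar| {
        if dollar {
            format!("${}", word)
        } else if placeholders.iter().any(|p| p == word) {
            format!("${}", &word[1..])
        } else {
            word.to_string()
        }
    })
}
```

Because the placeholder has the same length as the original `$foo`, line-width decisions made while formatting the substituted text still hold once the names are restored.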

lambda-fairy commented 4 years ago

Google led me here. Is it worth closing this issue now, since rustfmt does have basic macro support?

EDIT: rustfmt has a couple heuristics, but it doesn't really format macros properly in general. So this bug should stay open.

entropylost commented 3 years ago

Would it be possible to like annotate the macro like

#[rustfmt(struct_init)]
macro_rules! foo...

where the syntax for foo! would be

foo! {
  foo: "abcd",
  bar: 1,
};

?

calebcartwright commented 3 years ago

Would it be possible to like annotate the macro like

Suppose it could be possible for those defined within the project being formatted, but practicality would be questionable since I imagine rustfmt would have to do an upfront walk of the entire tree to search for such attributed defs. Wouldn't be possible at all for defs residing outside the current project, as rustfmt wouldn't have insight into those attributes.

That bifurcation would result in different formatting in cases like a crate's documentation vs. what consumers would see in the formatted version of the instances where they call those macros.

drbartling commented 3 years ago

Fairly new to Rust, and I'm curious: why is formatting macros in Rust hard? How does it compare to formatting macros or templates in C++?

lambda-fairy commented 3 years ago

@drbartling A Rust macro defines its own custom syntax. So rustfmt has to either hard-code support for it, or somehow expand it to figure out what it does, or have the developer annotate the macro to teach rustfmt how to format it.

Only the latter two options can work for all use cases. But macros can be imported, so rustfmt will have to be extended to resolve names across modules/crates. This is a big step up from its current approach, where it only looks at a single file in isolation.

timothee-haudebourg commented 2 years ago

@lambda-fairy a fourth, intermediate option would be to annotate each macro invocation, instead of the macro definition, with hints on how to format it. I guess most of the time a basic "add/remove one indentation level after {/}" + "insert a new line after a semicolon" could really improve readability even if it is not perfect. I agree it is less ideal than inferring formatting from the macro definition, but it would be easier to actually implement.
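
A rough sketch of that brace-and-semicolon heuristic, to show how little it needs. The function names `reindent` and `flush` are made up for illustration, and the sketch ignores string literals, comments and sequences such as `};`, all of which a real implementation would have to handle.

```rust
/// Write the pending line to `out` at the given depth, skipping blank lines.
fn flush(out: &mut String, line: &mut String, depth: usize, indent: &str) {
    let trimmed = line.trim();
    if !trimmed.is_empty() {
        out.push_str(&indent.repeat(depth));
        out.push_str(trimmed);
        out.push('\n');
    }
    line.clear();
}

/// Naive re-indenter: break the line after `{`, `}` and `;`, and indent one
/// extra level inside every `{ ... }` pair. All other characters are kept.
fn reindent(input: &str, indent: &str) -> String {
    let mut out = String::new();
    let mut line = String::new();
    let mut depth: usize = 0;
    for c in input.chars() {
        match c {
            '{' => {
                line.push(c);
                flush(&mut out, &mut line, depth, indent);
                depth += 1;
            }
            '}' => {
                // Finish whatever precedes the brace, then dedent.
                flush(&mut out, &mut line, depth, indent);
                depth = depth.saturating_sub(1);
                line.push(c);
                flush(&mut out, &mut line, depth, indent);
            }
            ';' => {
                line.push(c);
                flush(&mut out, &mut line, depth, indent);
            }
            // Existing line breaks are discarded and replaced by a space.
            '\n' | '\r' => line.push(' '),
            _ => line.push(c),
        }
    }
    flush(&mut out, &mut line, depth, indent);
    out
}
```

Calling `reindent("a { b; c; { d; } }", "    ")` puts each statement on its own line with four spaces per nesting level. Nothing inside a statement is touched, which is both the appeal and the limit of the approach.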

timothee-haudebourg commented 2 years ago

While I'm thinking about it, why not just leave the macro invocation formatting untouched? As @lambda-fairy said, a Rust macro defines its own custom syntax, and rustfmt is a Rust syntax formatting tool. Why not let the writer be in charge of formatting what is inside a macro invocation?

entropylost commented 2 years ago

Not all macros define custom syntax, e.g. println!. It's especially inconvenient to hand-format when an entire function definition is wrapped in a macro.

jquesada2016 commented 2 years ago

While I'm thinking about it, why not just leave the macro invocation formatting untouched? As @lambda-fairy said, a Rust macro defines its own custom syntax, and rustfmt is a Rust syntax formatting tool. Why not let the writer be in charge of formatting what is inside a macro invocation?

This is to me a very good idea. But what would such an API look like? How could macro authors (both procedural and declarative) go about instructing rustfmt how to format their syntax?

I think the majority of macros use rust syntax, and there should be a simple attribute that can be attached to the macro definition that tells rustfmt to format invocations as rust code.

For the custom syntax, I don't have a clue how procedural macros could go about doing this. For declarative macros, breaking changes aside, I could see it working by defining the syntax structure within the patterns themselves.

rbtcollins commented 2 years ago

If we just solved the case of a rust syntax embedded in a macro - e.g. the cfg-if crate is a good example of this - it would help developers a lot.
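
To make that case concrete, here is a small illustration. `passthrough_items!` is a hypothetical macro invented for this example (it is not cfg-if); it accepts ordinary items and emits them unchanged, so everything between its braces is plain Rust that would be formatted normally outside a macro call.

```rust
// Hypothetical pass-through macro: takes any sequence of items and re-emits them.
macro_rules! passthrough_items {
    ($($item:item)*) => {
        $($item)*
    };
}

// The invocation body below is ordinary Rust: a function and a struct.
passthrough_items! {
    pub fn double(x: i32) -> i32 {
        x * 2
    }

    pub struct Point {
        pub x: i32,
        pub y: i32,
    }
}
```

The items defined inside the invocation behave exactly like hand-written ones, for example `double(21)` returns `42`.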

ghost commented 1 year ago

I'm wondering, is there a reason rustfmt can't format simple "known" macro arguments, like expr, block, etc? I have a really simple macro that is literally just a few lines that surrounds a block with an if/else block:

macro_rules! section {
        ($title:literal, $body:expr) => {
        let span_section = trace_span!(target: UI_TRACE_BUILD_INTERFACE, $title).entered();
        let maybe_node = ui.tree_node_config($title).opened(true, Condition::FirstUseEver).push(); // Should be open by default
        if let Some(opened_node) = maybe_node {
            trace!(target: UI_TRACE_BUILD_INTERFACE, "node expanded");
            $body
            opened_node.end();
        }
        else{
            trace!(target: UI_TRACE_BUILD_INTERFACE, "node closed");
        }
        span_section.exit();
    };
}

In this example, $body is a simple block argument, which should be easily parseable since it has to be valid rust. However, rustfmt ignores it, which makes it tricky to work with (when I use it, I have many levels of nesting and it is hard to manually format it every time I change a few lines).

Would it be possible to simply tell rustfmt that a few certain argument types can be formatted, possibly with the ability to annotate the macro with a no_format attribute if it shouldn't be formatted? Would this solve the problem?

Edit

Just saw the mentioned pull #5538, and this doesn't seem to be the case for me. In my case, I am using parens to call, which is quite weird...

ytmimi commented 1 year ago

@Ararem rustfmt doesn't do any macro expansion and therefore doesn't know that $body:expr is going to expand to a block. Most expressions need a trailing ; if they aren't the last item in a block so adding a trailing ; to $body should solve your issue.

macro_rules! section {
        ($title:literal, $body:expr) => {
        let span_section = trace_span!(target: UI_TRACE_BUILD_INTERFACE, $title).entered();
        let maybe_node = ui.tree_node_config($title).opened(true, Condition::FirstUseEver).push(); // Should be open by default
        if let Some(opened_node) = maybe_node {
            trace!(target: UI_TRACE_BUILD_INTERFACE, "node expanded");
-            $body
+            $body;
            opened_node.end();
        }
        else{
            trace!(target: UI_TRACE_BUILD_INTERFACE, "node closed");
        }
        span_section.exit();
    };
}

Adding the ; and then running rustfmt produces:

macro_rules! section {
    ($title:literal, $body:expr) => {
        let span_section = trace_span!(target: UI_TRACE_BUILD_INTERFACE, $title).entered();
        let maybe_node = ui
            .tree_node_config($title)
            .opened(true, Condition::FirstUseEver)
            .push(); // Should be open by default
        if let Some(opened_node) = maybe_node {
            trace!(target: UI_TRACE_BUILD_INTERFACE, "node expanded");
            $body;
            opened_node.end();
        } else {
            trace!(target: UI_TRACE_BUILD_INTERFACE, "node closed");
        }
        span_section.exit();
    };
}

ghost commented 1 year ago

@ytmimi I may have been unclear - it was where I was using the macro that it didn't format correctly, not the macro definition. And regarding the macro expansion, does that mean that rustfmt doesn't know that it's a body block when I use it, either?

If this is the case, how about implementing an approach similar to gene-michaels - where it attempts to parse it as valid rust, and if it does, then it formats it. Or would that also still cause issues?

ytmimi commented 1 year ago

@Ararem it might be better to open up a new issue to discuss your issue in more detail

ghost commented 1 year ago

@Ararem it might be better to open up a new issue to discuss your issue in more detail

@ytmimi I was slightly wrong as to the cause, but will do soon. Apologies for cluttering this issue. Shall I delete my previous comments?

ytmimi commented 1 year ago

@Ararem no worries, and no need to delete

ghost commented 1 year ago

@ytmimi Have created issue #5652

dvc94ch commented 1 year ago

is there a way for proc macro developers to provide rustfmt with a formatting routine?

ytmimi commented 1 year ago

@dvc94ch there is no way for a macro author to tell rustfmt how to format their macro. If you'd like, you can open a new issue to discuss your specific use case in more detail.

jkelleyrtp commented 1 year ago

As far as I know, dioxus fmt is the only project to provide macro formatting.

https://github.com/DioxusLabs/dioxus/tree/master/packages/autofmt

It would be great to allow dioxus fmt to hook into cargo fmt somehow.

GilShoshan94 commented 1 year ago

Would it be possible for a library author to add some kind of instructions for rustfmt on how to format its macro invocations?

Maybe something like macro_rules!, but for formatting; let's call it rustfmt_rules!, defined just after the definition of the macro.

For the instructions I was thinking of reusing some of the fragment specifiers from the macro system itself.

For example for tokio::select! it could be:

/// In tokio/src/macros/select.rs

#[macro_export]
#[cfg_attr(docsrs, doc(cfg(feature = "macros")))]
macro_rules! select {
    ...
} 

rustfmt_rules! select {
    $(biased;\n):?
    $($:pat = $:expr$(, if $:expr):? => $:expr,\n)*
    $(else => $:expr):?
}

In the rustfmt_rules! we proceed in order.

First, we can see a capturing group $( ) that is qualified with :?, which would mean an optional group that may not be present. Inside the capturing group, we have a regex biased;\n, so the formatting should be just "biased;" and a newline.

Next, we have a zero-or-more repetition group $( )*. Inside it we have a Rust pattern $:pat followed by " = " and a valid Rust expression $:expr, followed by an optional group (", if " + Rust expression), then a literal " => ", a Rust expression, "," and a newline.

On the last line, we have an optional group "else => " + valid Rust expression.

So this is kind of a reuse of the macro system, but instead of parsing, it would be used to inform rustfmt how to format (literal, newline, valid Rust code...)

calebcartwright commented 1 year ago

@GilShoshan94 that suggestion was previously made above, and the challenges/rationale against has similarly already been shared

tgross35 commented 1 year ago

What sort of issues would there be with only adjusting spacing around punctuation? Not applying wrapping or anything else to the content, but I think a simple subset might handle the most common mistakes:

// before
macro!(
Baz:qux
foo =>bar;
    [ "quux",corge]);

// after
macro!(
    Baz: qux
    foo => bar;
        ["quux", corge]
);

calebcartwright commented 1 year ago

Contextual reminder: rustfmt operates on the AST, and does not directly work with input files. For macro calls, rustfmt doesn't really directly process the arg token streams either; it chucks the tokens back to rustc_parse, and if rustc_parse says those tokens look like some other type of valid Rust syntax (e.g. an expression), then rustfmt is able to apply the associated rules. This is important to keep in mind because it's not a question of "adjusting" things like existing whitespace/indentation in an input file.

I don't think that's behavior we'd want to drop in lieu of more simplified token-by-token processing (even if feasible), because many users/many call sites want the args to be formatted just like regular Rust code. I'm also not sure if/how well those two models could coexist, or even if one could truly define a singular set of by-token rules that would work unequivocally across all macros (my gut says that at a minimum there would be pairs of macros with conflicting needs e.g. what makes sense for html! may be contrary to the needs of some other macro).

TBH I think the only feasible path forward that maintains a cohesive formatting experience with the rest of the code is to have better support for macros that do work with valid-Rust syntax (there's some challenges in the current model that are solvable, e.g. calls with tokenstreams that can be parsed as multiple types of valid syntax, designating/handling specified macros with args that are mostly valid syntax, etc.) and then potentially starting to special case individual macros, preferably with the majority being able to utilize a relatively small set of formatting patterns.

However, I don't expect any changes/improvements on this front for the foreseeable future; we have too little bandwidth and too many cases of valid Rust syntax that rustfmt doesn't yet support, which take priority.
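
The "parse it, and only format it if it parses" model from the start of this comment can be sketched as follows. This is a toy under loud assumptions: `parse_expr_list` is a crude stand-in for rustc_parse that only checks delimiter balance and splits on top-level commas, and `format_macro_args` is a hypothetical name, not a rustfmt function.

```rust
/// Toy stand-in for a real parser: accept a comma-separated list whose
/// delimiters are balanced and which has no top-level `;`.
fn parse_expr_list(args: &str) -> Option<Vec<String>> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut stack: Vec<char> = Vec::new();
    for c in args.chars() {
        match c {
            '(' | '[' | '{' => {
                stack.push(c);
                current.push(c);
            }
            ')' | ']' | '}' => {
                let open = stack.pop()?;
                if !matches!((open, c), ('(', ')') | ('[', ']') | ('{', '}')) {
                    return None;
                }
                current.push(c);
            }
            ',' if stack.is_empty() => {
                let expr = current.trim();
                if expr.is_empty() {
                    return None;
                }
                parts.push(expr.to_string());
                current.clear();
            }
            // A top-level `;` means this is not a plain argument list.
            ';' if stack.is_empty() => return None,
            _ => current.push(c),
        }
    }
    if !stack.is_empty() {
        return None;
    }
    let last = current.trim();
    if !last.is_empty() {
        parts.push(last.to_string());
    }
    Some(parts)
}

/// Format the arguments if they parse; otherwise return them untouched.
fn format_macro_args(args: &str) -> String {
    match parse_expr_list(args) {
        Some(exprs) => exprs.join(", "),
        None => args.to_string(),
    }
}
```

The point of the sketch is the fallback: input that the parser rejects comes back byte-for-byte unchanged, rather than being "adjusted" token by token.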

narodnik commented 6 months ago

Why not just add rust formatting for info!() macros? Or macros which are simply like function calls (except using ! in their name).

baxterjo commented 3 months ago

What about a way for individual macro creators to dictate how their macros are formatted? Something like a trait, but specific to cargo fmt that has an optional implementation if the macro creator so chooses to implement it. I could see large framework creators like tokio-rs taking the time to implement this, while smaller crates don't have to.

chipnertkj commented 4 weeks ago

I have opened a discussion thread in the Rust Internals Forum related to this issue. If there is something constructive you could add to the discussion (criticism, possible solutions, concerns, use cases), please head through the link below. https://internals.rust-lang.org/t/discussion-adding-grammar-information-to-procedural-macros-for-proper-custom-syntax-support-in-the-toolchain/21496

calebcartwright commented 4 weeks ago

Want to reiterate that the suggestion to let macro authors define their own formatting asked in https://github.com/rust-lang/rustfmt/issues/8#issuecomment-2197671615 and alluded to again in the IRLO thread noted https://github.com/rust-lang/rustfmt/issues/8#issuecomment-2334077035 has been suggested and responded to multiple times in this thread (e.g. https://github.com/rust-lang/rustfmt/issues/8#issuecomment-1470141940)

Different tools operate at different stages of the process for varying and valid reasons based on their own respective purposes and contexts/constraints.

rustfmt operates at an early stage in the process, directly on the AST, which allows it to do things like still being able to format code that doesn't compile. Even if we momentarily presume there's a mechanism that makes it both possible and desirable for macro authors to dictate their own formatting, there's a technical problem: that information would not be accessible in the earlier stages like lexing & AST generation. It would require rustfmt to shift to some stage after expansion, resolution, etc., which would in turn mean certain features/capabilities are no longer possible.

Furthermore, I'd just pose the question as to whether or not it would truly be desirable for consumers of a macro.

Imagine a scenario where you're using multiple 3rd party macros, each of which has authors that have widely diverging formatting that starkly contradict each other (e.g. within your own codebase you've got standard rustfmt'ed code using 4 space indents immediately followed by one macro callsite where the author forced 8 space indents that's then followed by another macro call that's got 2 space indents, etc.)

Each macro author would also ostensibly have the autonomy to change their formatting whenever they wanted (i.e. I'm not aware of any semver spec requirement that would force macro authors specifying their own formatting rules to have to do a major version bump if they changed their formatting rules), so I think you'd be forced to ensure that relevant dependencies are pinned to an exact version to ensure formatting is consistent for all your contributors and that CI checks don't fail because you have an ephemeral CI environment that picks up a different version of a transitive dependency, upgrades on dependencies containing macros would potentially introduce code diffs, etc.

What about configuration options? Macro authors could conceivably want to enable configurable options for their bespoke macro formatting, and if so, where would those be defined? Would they want their users to be able to specify those in the standard rustfmt config file? How would rustfmt (and the rustfmt team) be able to marry those options correctly with whatever version of the dependency defines the macro & macro formatting rules? What if macro authors wanted their own config file?

I ask these questions somewhat rhetorically to convey some skepticism. I'm sure there's people smarter than me that could devise grand solutions for all of these, but I'm still not convinced it would be the right solution to the problem.

A big part of the approach and bounding constraints (e.g. stability guarantee) for rustfmt are centered around tenets like consistency and minimizing formatting-driven code churn. I feel like positioning rustfmt as a general purpose formatting platform would run counter to that or at a minimum create a very conceivable surface for those tenets to be directly contradicted.

chipnertkj commented 4 weeks ago

@calebcartwright First of all, thank you for your detailed reply and summarizing the discussion so far for me. Let's see.

that's not information that would be accessible in the earlier stages like lexing & AST generation

I understand the current constraints of rustfmt with respect to early processing stages, but aren't proc macros processed in a compilation unit separate from consuming code? Perhaps this is a misunderstanding on my part, but wouldn't defining syntax metadata/parser/etc. in the same compilation unit allow potential consumers of said data, like rustfmt, to access it at any stage of processing the inputs to a macro, as long as it is exposed somehow?

Imagine a scenario where you're using multiple 3rd party macros, each of which has authors that have widely diverging formatting that starkly contradict each other (e.g. within your own codebase you've got standard rustfmt'ed code using 4 space indents immediately followed by one macro callsite where the author forced 8 space indents that's then followed by another macro call that's got 2 space indents, etc.)

I feel it would be the responsibility of the library developer to provide a formatting experience that does not conflict with the user's needs. Eg. rustfmt doesn't force you to use specific indentation, so why should they? Though I suppose there could be edge cases where a certain kind of formatting is genuinely needed for the macro to work, in which case... This seems more like a functionality issue than a formatting one, as the mismatch would exist independently of formatting tools. I hope I understood your concern correctly.

Each macro author would also ostensibly have the autonomy to change their formatting whenever they wanted (i.e. I'm not aware of any semver spec requirement that would force macro authors specifying their own formatting rules to have to do a major version bump if they changed their formatting rules)

It is impossible to force or expect a developer to interpret semver in a specific way, but this would directly impact the user experience when working with a macro. I think it would be reasonable to expect stability guarantees similar to those provided by rustfmt. Naturally, developers will avoid introducing changes that disrupt CI processes, as stability is a key factor for adoption.

I understand and share the concern of formatting and macro definition versions being entangled, and it's not something I quite have a good solution to. Maybe establishing formal guidelines or best practices around versioning for macro formatting rules would help clarify these expectations without enforcing rigid rules. If this isn't satisfactory, perhaps a community discussion could help find a common ground. 😉

What about configuration options?

If rustfmt were to adopt such an interface, external files could be one approach, potentially with a hardcoded filename and a discovery algorithm similar to rustfmt’s own. This would avoid cluttering the existing configuration schema, and it’s just one example of how these challenges could be addressed.

I’m not advocating for a specific solution here, but rather exploring possibilities to demonstrate that practical solutions are out there. Of course, these are just initial thoughts, and community feedback would be essential in refining any approach.

I feel like positioning rustfmt as a general purpose formatting platform would run counter to that or at a minimum create a very conceivable surface for those tenets to be directly contradicted.

I sympathize with that concern. This is one of the thoughts that have been on my mind. Expanding it into a general-purpose formatting platform could indeed lead to complexities that might compromise consistency and stability.

To clarify, rustfmt is mentioned here mainly because it’s been the focal point for much of the community's requests and discussions on the topic. I’m not entirely convinced that adapting rustfmt is necessarily the best or only approach, but there’s clearly significant interest in improving the toolchain's capabilities when working with macros.

Different use cases and values exist across the community. I believe it would be beneficial to create a space where these discussions can take place. I am interested in exploring how we can address these needs without compromising on the goals of any part of the toolchain.

calebcartwright commented 3 weeks ago

but wouldn't defining syntax metadata/parser/etc. in the same compilation unit allow potential consumers of said data, like rustfmt, to access it at any stage of processing the inputs to a macro, as long as it is exposed somehow?

If Rust's AST contained the information then yes, any tool that works with Rust's AST would have access to that information.

Whether Rust's AST should or feasibly could contain that type of metadata, and what the associated impacts could be, is a different discussion for different people (and I know you're just trying to facilitate such a discussion, I'm just noting that we can't really speak to that beyond personal thoughts & opinions).

ytmimi commented 3 weeks ago

I understand the current constraints of rustfmt with respect to early processing stages, but aren't proc macros processed in a compilation unit separate from consuming code? Perhaps this is a misunderstanding on my part, but wouldn't defining syntax metadata/parser/etc. in the same compilation unit allow potential consumers of said data, like rustfmt, to access it at any stage of processing the inputs to a macro, as long as it is exposed somehow?

To add to Caleb's comment above, rustfmt can really only reliably and consistently format your code based on information in the AST. rustfmt is capable of formatting code that doesn't compile (as long as it parses), and it's possible to format code for a project before it's ever been compiled. Relying on the output from compilation and potentially introducing formatting differences between pre / post compilation would not be ideal.