[static-metaprogramming] Make "Refactor -> Rename..." convenient for code generated by Macros

Quijx commented 1 month ago

There are often cases where identifiers are named the same as or similar to other identifiers. For example when writing a copyWith method, the parameters are named the same the parameters they are meant to overwrite. Or an immutable class having a mutable builder class named the same, just with Builder as suffix.

Currently, when one has to rename an identifier that has other identifiers named after it, all of those can be renamed using the rename feature of the IDE (for example "Refactor -> Rename..." in Android Studio). The number of identifiers that have to be individually renamed this way is usually quite low, since only the definitions have to be renamed, and all the use sites are renamed automatically.

With macros however, when renaming an identifier, for which a macro generates more identifiers that are named the same as or similar to it, these identifiers will not be renamed. After the macros are executed again, the old definitions of the identifiers will be gone and all use sites in the code will show errors (or worse, silently resolve to different identifiers). Fixing these may require a significantly higher amount of effort than if the generated code was handwritten.

Here are just some examples of cases where identifiers are typically named after other identifers.

Parameters of copyWith methods are named the same as the field values they overwrite.
Flutter widgets generated from a function, like here: https://github.com/dart-lang/language/blob/main/working/macros/example/lib/functional_widget.dart
A public getter named the same as its private field, just without _
Immutable classes and their builder with something like https://pub.dev/packages/built_value
Variables named after their type. (Something like TextStyle textStyle.)
Generated riverpod providers are named after the functions defining them. See https://pub.dev/packages/riverpod_generator

Since these cases are extremely common, I think it is benificial if macros are somehow given the possibility to create identifiers that are automatically renamed, if another identifier is renamed.

My Proposal Idea

Now, I don't know much about how the renaming system works, so maybe this is commpletely the wrong way to go about this. But i assume that the system is separate from the macros, so "Refactor -> Rename" cannot execute code defined by macros. So if a macro would name an identifier using another in a crazy way, "Refactor -> Rename" has no idea, which identifier to rename and what to.

Luckily the most common transformations done on identifiers are probably one of the following:

Change case (camelCase, UpperCamelCase, snake_case, ...)
Add prefix (including _)
Add suffix
Remove known prefix
Remove known suffix

So my idea is that a macro can add annotations to existing or generated identifiers which should be named similarly in the sense that can be derived from the other only using the above transformations. There are two types of annotations: NameLeader and NameFollower.

A NameLeader defines an identifier from which the others are derived. It only has a required String id parameter. The id does not matter, but it must not conflict with any other ids. The macro could generate a UUID for example.

NameFollower annotates identifiers whose name can be computed from a corresponding NameLeader using the transformations above. It has a required String id and a List<NameTransformation> transformations which defaults to []. NameTransformations are for example AddPrefix('_') or ChangeCase(Case.camelCase).

Before I explain exactly how renaming would work, here an example with a functional widget of how this would look:

Hand written code:

@functionalWidget
Widget _coloredText(
  BuildContext context, {
  required String text,
  required Color color,
}) {
  return ColoredBox(
    color: color,
    child: Text(text),
  );
}

Generated Code

@NameFollower(
  'dmdkaVYrWGJo',
  transformations: [
    ChangeCase(Case.camelCase),
    AddPrefix('_'),
  ],
)
augument Widget _coloredText(
  BuildContext context, {
  @NameLeader('O1tWP0QnQjFY') required String text,
  @NameLeader('b31pT3pUOERa') required Color color,
});

@NameLeader('dmdkaVYrWGJo')
class ColoredText extends StatelessWidget {
  const ColoredText({
    super.key,
    required this.text,
    required this.color,
  });

  @NameFollower('O1tWP0QnQjFY')
  final String text;

  @NameFollower('b31pT3pUOERa')
  final Color color;

  @override
  Widget build(BuildContext context) {
    return _coloredText(context, text: text, color: color);
  }
}

If one tries to rename an identifier that is annotated with a NameFollower, the identifier annotated with a NameLeader with the same id is renamed instead. So if we try to do a "Refactor -> Rename" on _coloredText, the dialog for renaming ColoredText is opened instead. Once the NameLeader is renamed, the NameFollowers are also renamed to a name that is computed by starting with the new name of the leader and applying the transformations defined in the follower.

jakemac53 commented 1 month ago

I do see how this is a real issue that could come up fairly often, when macros are generating new declarations.

I don't love the solution proposed here, as it requires a lot of manual work to do (I think users would be expected to add these @NameLeader annotations, and every macro would have to know to add the @NameFollower annotations?). But, maybe something more automatic is possible, or a different approach to the problem is possible.

One idea could be to have instead of the rename functionality, a "fix all similar problems" functionality. So, lets say you rename your functional widget function, which changes the widget name, and now have errors everywhere that referenced that widget. What if you could go to one of those, and initiate the fix for all the other similar errors from that?

cc @pq @bwilkerson maybe one of you have ideas here too

Quijx commented 1 month ago

@jakemac53

I agree that the solution adds quite a bit of work that is perceived as boilerplate code, but at least in only requires the work to be done by macro authors and not macro users. Also the work is optional. If the annotations are not added, you will have a slightly worse time renaming things, but that may be ok for rarely used macros. So I would think that it is still acceptable, if there is no better solution. That said I tried to find a solution that works within the bounds of how I think the "Refactor -> Rename" works, but I actually have no idea :) I hope there is a better way.

I'm sceptical of the "fix all similar problems" functionality though. I feel like it's very hard to define "similar" such that no wrong renamings are done. Also it still makes it quite annoying to rename things, since you still have to fix at lease one use site. If a macro generates multiple things with derived identifiers, every one would need its own fix.

lrhn commented 1 month ago

Macro generated code should not be edited, including renaming. If the original changes, the macro should run again and generate new names.

I think it would be more useful for non-macro-generated code to be able to say @NamedAfter('Foo') so that the analysis server can remember related declarations too.

Quijx commented 1 month ago

@lrhn After the "Refactor -> Rename", the macros would rerun again anyway, right? So the renaming would only be very temporary and basically only affect the use sites. If the macro then decides to rename the generated identifiers in a way that does not match the way it is described in the annotations, or does not generate the identifiers at all, there would be compile time errors. So the macro author would have to pay attention to implement the name generation the same as the annotations.

If the limitation that macro code can not be renamed is technological, maybe only the use sites in non generated code could be renamed?

I think it would be more useful for non-macro-generated code to be able to say @NamedAfter('Foo') so that the analysis server can remember related declarations too.

How would the work if the declaration is macro generated though? And this would offload the entire work to the macro user and not the author, right? If so, I doubt users would use that much, and they would just fix the resulting errors by hand, instead of adding annotations before renaming.

jakemac53 commented 1 month ago

but at least in only requires the work to be done by macro authors and not macro users.

It does require the user to opt in though, in the functional widget example, and I think all of these examples? On the original declaration, the user has to write a @NameLeader annotation, invent some ID (also, how should they do this), etc.

Possibly, a macro could also add that annotation (we don't have that API but augmentations do support it). So, that could help out. Macros would also have to be super careful about generating deterministic IDs, and I am not sure they really have the ability to do so. It seems like a lot of work and boilerplate just to get rename working well.

Quijx commented 1 month ago

It does require the user to opt in though, in the functional widget example, and I think all of these examples? On the original declaration, the user has to write a @NameLeader annotation, invent some ID (also, how should they do this), etc.

Yeah, I assumed that augumentations can add annotations.

The way I envisioned it, the user would not have to do anything but add the usual @functionalWidget annotation. The @NameLeader or @NameFollower annotations would be added by macros via augumentations to the hand written declarations or of course the generated ones. So the macro user should not have to invent ids or add extra annotations.

Macros would also have to be super careful about generating deterministic IDs

Yes, we would need a way to get an id associated with a given identifier probably. Maybe this could be a hash of the file path, the identifier name and the "parents"? So for example a hash of my_package:lib/functional_widget.dart/_coloredText.color for the color parameter in _coloredText or something like that? That wouldn't work in some edge cases though, for example with two unnamed extensions in a file that contain two methods with the same name. Not sure.

bwilkerson commented 1 month ago

That said I tried to find a solution that works within the bounds of how I think the "Refactor -> Rename" works, but I actually have no idea :)

The rename refactoring works by having an index built ahead of time that (logically) maps every declaration to a list of the references to that declaration. Having such an index allows the server to efficiently find all of the other places that need to be updated when the declaration is renamed.

What's currently missing from the index is information about

declarations whose names are derived from the declaration to be renamed and
the pattern used to derive the names of those declarations.

I believe that having that information would give us the ability to find the declarations that will be re-generated, the references to them, and to compute the new name that will be generated. That, in turn, would allow us to update all of the references in non-generated code in anticipation of the macros being re-run.

Of course, the quality of the UX would depend on the pattern being accurate. If the generator sometimes needed to make exceptions to the pattern it normally uses then this feature could make it even harder for users to manually fix their code than if the feature weren't implemented.

The annotations are an interesting idea, and would give us the needed information, but I agree that the UX is suboptimal. It might be the best we can do, but it's worth trying to find a better solution in case we can do better. I can think of two possibilities, both of which require getting information from the , though I haven't thought deeply about either of them so there might be problems with them too.

Automatically populate the index.

The idea here is to automatically capture the same information as would be specified by the annotations without needing to write them (either manually of generated).

I don't know whether it's feasible, but it would be ideal if the macro could specify

the declaration being created,
the declaration on which the name is based, and
the pattern used to derive the new declaration's name,

and if this information could be captured in the index automatically.

Automatically fix the resulting errors.

Similar to Jake's question about fixes, we might be able to allow a macro to produce a temporary fix_data.yaml file specifying the new names of the regenerated declarations. The server could then apply fixes after the renaming was complete and the new diagnostics had been generated.

That wouldn't be nearly as good as a UX, but it wouldn't require the user to do any work and would avoid the problem of having to specify a pattern. It would also require that the macro had access to the old names of the declarations, and it probably doesn't, so it might be a non-starter.

tatumizer commented 1 month ago

This idea probably won't work, but I wonder why. If we know why, there can be a way to fix it. Suppose the user wants to rename foo->bar somewhere in the code. Execute the following steps

Run macros on old code, save the results
Run macros on a new code (which differs only in that foo is replaced by bar)
Compare the results by looking for all places where foo in the saved code was renamed to bar in the new code. Suppose we find foo_123 renamed to bar_123, Foo - to Bar
In the declarations of foo_123 and Foo (they reside in the old generated code), simulate manual "rename", as if the user invoked "rename" for foo_123 -> bar_123, Foo -> Bar, etc.
Repeat recursively.

lrhn commented 1 month ago

The train that won't work is that either you renamed the original declaration, and then the macros will generate derived declarations with new names too, and there is nothing new to do, or you renamed a generated name, and then macros will overwrite that and nothing changes. If you are allowed to rename a macro generated declaration at all (I wouldn't allow that).

The annotation idea can work, if it annotates the macro generated declaration with a link to the original declaration that it is named after, in which case attempting to rename the derived name should redirect to the original declaration, either as a message, "cannot remember macro generated code, try renaming Foo of path/file.dart instead", or by directly triggering s renaming of that declaration. In either case, it only needs the annotation link in one direction.

If macros generate a source-map too, maybe that can be used.

tatumizer commented 1 month ago

you renamed the original declaration, and then the macros will generate derived declarations with new names too, and there is nothing new to do

I think there's still something to do, it's just our interpretations of the problem are different. Example of user's code:

@generateCopyWith()
class Foo {
  final int a, b;
  Foo({this.a=0, this.b=1});
}
Foo u=Foo(a:5, b:10);
Foo v=u.copyWith(a:42); // using generated method with parameter 'a'

Now suppose the user renames the field int a in Foo to int x. Indeed, the macro generates new code, but the parameters of the generated copyWith are different

BEFORE: Foo copyWith({int? a, int? b) {...} AFTER: Foo copyWith({int? x, int? b) {...}

But the use site in user's code Foo v=u.copyWith(a:42); still refers to the old name of a parameter. It should be modified to Foo v=u.copyWith(x:42); automatically. I tried to address this problem in the previous comment so that the use site is modified accordingly.

(I agree that the user should be allowed to use the "rename" feature only for their own declarations IMO When I say "simulate manual rename", I mean compiler's simulation of the action the user is not allowed to do. Not the best wording, but I don't know how to put it better. )

lrhn commented 1 month ago

That does make sense. I was thinking only of declarations (because that was the ones we wanted to annotate), but the uses will need to be renamed too.

tatumizer commented 1 month ago

Yes!!! My claim is that everything can be done without any annotations, based on the efficient diffing only. Please provide counterexamples!

lrhn commented 1 month ago

Seems plausible, but not trivial. The "efficient diffing" probably requires having both versions in memory at the same time, or at least a useful summary, and finding some kind of connection between the declarations of the two versions. It needs to recognize that a different declaration is (probably, heuristics may be enough) a rename of a previous one, not just a new and unrelated declaration.

If the analysis server knows that the only change between the two versions was a rename refactoring, it may be reasonable to assume that all changes in the program are renames too, but that does depend on the macro. If a class renames its hand-written toJson to something else, the macro may now generate a default toJson method, which is not a rename of anything.

The analysis server can't just check that the bodies of functions are the same, because the changed name may be embedded in the body too, say a generated throw UnupportedError('<methodName>');.

tatumizer commented 1 month ago

Good example. The algorithm could be fixed though.

Definition: The macro is called 'well-behaved' WRT renaming of X to Y in the user's code iff such renaming triggers only the "renaming" of token "X" to token "Y" (case-insensitive; should account for the removal of leading _, etc - need to elaborate) in the generated code, and no other changes. (This implies that the well-behaved macro is idempotent)

Example of well-behaved macro: the user renames foo to bar in the code. (Suppose the change is owner-invariant) Then the macro generates new code, which effectively renames foo_123 to bar_123, FooBuilder to BarBuilder, etc., and contains no unrelated changes.

(The macro, of course, doesn't know the old name, so it doesn't literally "rename" anything, but the simple pointwise "diff" can detect what effectively was renamed to what. As a side-effect of the diff, we will also discover whether the code contains any unrelated changes).

If the macro is not well-behaved with respect to the specific renaming X->Y, the changes won't be propagated to the use sites. (this is a correct behavior in the case of toJson).

The macro can be tested for well-behavedness for sensible changes in advance.

EDIT: simplified the definitions: well-behavedness subsumes all other assumptions.

The "efficient diffing" probably requires having both versions in memory at the same time, or at least a useful summary, and finding some kind of connection between the declarations of the two versions.

We probably need only to preserve (and compare) the declarations in the generated code. The definition of well-behavedness implies that the diff can be computed by a linear scan (no graphs).

(thinking more about it: the complexity of Miller/Myers algorithm is O(N^2); there's a package in dart that implements it. The size of concatenated declarations in the generated code might be small enough to make the overhead of diff tolerable)

rrousselGit commented 1 month ago

I've said this before, but IMO there's an overlap between this and the "redirect go-to-definition to somewhere else for generated code".

Macro authors will typically want both at once. It might be worth to bundle both in a single annotation

tatumizer commented 1 month ago

Could you please provide a concrete example of what "redirect go-to-definition to somewhere else for generated code" means?

dart-lang / language