dart-lang / language

Design of the Dart language
Other
2.67k stars 205 forks source link

Macros: How to gracefully handle name conflict between user input and generated code? #3345

Open rrousselGit opened 1 year ago

rrousselGit commented 1 year ago

A common failure point of code-generators is, there may be some name conflict between user-defined code and generated/inherited code.
In a way, generators tend to have "reserved variable names" – which tend to vary based on the version of a given package.

If a conflict happen, that will generally trigger a compilation error in the generated code. That's fine, but not ideal in terms of UX. Would macros possibly offer a way to streamline this? Or would macro authors have to deal with this on their own?

rrousselGit commented 1 year ago

Adding to that, it's quite common that generators involuntarily break semver due to this.

Maybe a new version of a package adds a new feature, which implicitly adds an extra "reserved variable name" to the list. Breaking existing code in the process, but in a very underhanded way.

I've certainly done so inadvertently a few times already.

jakemac53 commented 1 year ago

In general, this problem should be less present in macros (and augmentation libraries?).

We have talked about a way to generate random, guaranteed unique, names as well. But I am not sure if that would solve your use case.

rrousselGit commented 1 year ago

I don't think this problem will be less present really.

Here's one example I am facing. Users can write the following:

@riverpod
class Example extends _$Example {
  @override
  Whatever build(<any number of parameter the user define, named or positional>) {
  }
}

And this generates something among the lines of:

abstract class _$Example extends SomeBaseClass {
  <for every parameter in "build", a property with the same name is added on this class>
}

So for example with:

@riverpod
class Example extends _$Example {
  @override
  Whatever build({required int page, String? state}) { }
}

Then we have:

abstract class _$Example extends Notifier<int> {
  int get page => /* ... */
  String? get state => /* ... */
}

Here's the thing though:

Notifier<int>, the inherited class, already has various properties. Including one already named "state". As such, the String? get state conflicts with Notifier.state

In fact, there are more than just one place in the generated code having the same issue. Parameters on build are present in the generated code in multiple places with different subclasses, so different possible conflicting names.

Macros would likely do nothing here. Nor would guaranteed unique names.

And I've had this issue in effectively all of my code-generators, because they all do something similar (aka porting parameters defined on a function somewhere else under the same name)

mateusfccp commented 1 year ago

This is a very old problem with code generation. Maybe Dart could provide something like Common Lisp's GENSYM to generate unique symbols?

A con of this is that everytime the macro is run the generated code will have different parameter names. This could be mitigated with commentaries that maps to the original name:

abstract class _$Example extends Notifier<int> {
  int get s1203981 /* page */ => /* ... */
  String? get s0194815 /* state */ => /* ... */
}
jakemac53 commented 1 year ago

You would be able to inspect the super-class chain to identify this situation, but yeah I don't see how you could avoid it. I don't think it really has anything to do with macros?

I think you just have to tell users that they can't have a field with that name.

This is a very old problem with code generation. Maybe Dart could provide something like Common Lisp's GENSYM to generate unique symbols?

Right this has been discussed and could happen with sufficient motivation but I don't believe it solves this problem. The users are expecting public fields with specific names to be generated, those names are just already taken.

jakemac53 commented 1 year ago

Note that just doing nothing as a macro author is also somewhat fine. The user will get an error just like they would today if the invalid code was in a generated part or library file. It will just be in an augmentation instead.

mateusfccp commented 1 year ago

Right this has been discussed and could happen with sufficient motivation but I don't believe it solves this problem. The users are expecting public fields with specific names to be generated, those names are just already taken.

Yeah, I think I got confused about the issue here.

rrousselGit commented 1 year ago

Note that just doing nothing as a macro author is also somewhat fine. The user will get an error just like they would today if the invalid code was in a generated part or library file. It will just be in an augmentation instead.

The main issue is UX. They do get an error, but it's usually a very cryptic error.
I was wondering if there was a way to maybe officially handle such scenario to provide a native error for this – akin to reserved language keywords but for macros.

I can do something on my own. But it's likely a very common issue with macros (& codegen in general), so I figured an official feature may make sense here.

jakemac53 commented 1 year ago

There could definitely be some special error, like "macro X tried to add a field int x which wasn't a valid override of the getter String get x from SomeSuperType", etc. That could show up on the macro annotation itself, so the user might have better context.

jakemac53 commented 1 year ago

I do wonder if the right way to approach this is making the general purpose error for invalid overrides less confusing? If people aren't understanding that error, making it better could solve both issues.

rrousselGit commented 1 year ago

I'd add that we can't quite rely on "isn't a valid override" error. There may be a name conflict that does compile without any issue.


Maybe one thing we could think of is, generated symbols imported from user code could be annotated with a special annotation. Such that if there's ever a name conflict, an error on the user-defined symbol would be shown?

For example, we could have:

abstract class _$Example extends Notifier<int> {
  @FromUserCode('[Example.build#state]')
  String? get state => /* ... */
}

or something among those lines.

And if state overrides a property from Notifier, valid override or not, then we'd see an error on Example.build#state

This could even overlap with other similar features, such as:

jakemac53 commented 1 year ago

What makes macros different in this regard compared to regular code? I would argue overriding any concrete getter/setter (maybe magic ones from a field), with another field, should always give some sort of diagnostic, even in user code. At best it is a code smell.

rrousselGit commented 1 year ago

There's a layer of indirection due to code-generation involved. When a user defines a field in one place, they likely do not realise that this may override a different field under the same name in a different place in generated code.

munificent commented 10 months ago

I agree that this is a problem, but I think it's somewhat fundamental and inescapable.

When a user defines a field in one place, they likely do not realise that this may override a different field under the same name in a different place in generated code.

To some degree, they have an obligation to realize that. When a user chooses to invoke a macro, they are asking it to generate some code on their behalf. If the macro is generating public members, then the user presumably does want to use those members (otherwise the macro should make them private), in which case they do need to know that they'll be created and could collide with other members.

rrousselGit commented 10 months ago

I don't think it is as fundamental as we make it to be.
It likely should be feasible to have some built-in errors or warnings for such scenarios.

For example, we could have an official convention for fields that are copy-pasted in generated code from the annotation class. This would be similar to how overrides should always be annotated with @override.

IMO there's a lot of overlap between this and other macro-related features, such as the "go to definition override".

rrousselGit commented 10 months ago

In fact thinking about it, I'd expect that in most cases, there should naturally be a warning in the generated code: Chances are the implicit override does not have an @override annotation:

@macro
class Example {
  String? name;
  int? age;
}

// Imported:
abstract class BaseClass {
  String? name;
}
// Generated:
class ExampleGenerated extends BaseClass {
  // Fields imported from [Example]
  // This overrides [BaseClass.name] but does not have the @override, due to being a mistake
  String? name;
  int? age;
}

Maybe one thing we could do would be to warn on the @macro if generated code have a "missing @override" warning in it

jakemac53 commented 10 months ago

The question I think is where to draw the line exactly here on diagnostics in macro generated code.

I definitely agree that the diagnostics about missing @override annotations should be applied to macro generated code. Probably a lot of other diagnostics too. But which ones exactly? It is hard to say.

One idea would be to say we apply the "core" lint set to all macros, but even that set actually comes from a package, which complicates things.

Ideally, I think macros themselves would choose which diagnostics are applied to their own output, so they have a single consistent target to hit and not a moving one.

rrousselGit commented 10 months ago

Note that in the case of missing @override, I wouldn't stop at putting a warning in the generated code. I'd apply that warning on the associated @macro too – with maybe a way to override the location of this warning to show-up on the problematic corresponding user code.

The goal would be that for such a warning, users shouldn't have to look at generated code to see that there is an issue and what is causing the problem.