dart-lang / language


Macro application annotation identification #3728

Open lrhn opened 5 months ago

lrhn commented 5 months ago

EDIT: Updated to allow designating a constant variable as a macro application annotation, and removed the attempt to recursively follow variables or type-aliases. Only the macro application declarations that are mentioned in the package configuration will trigger macros when used as annotations.

Compilers need to recognize annotations as macro application annotations, and they need to do so at a point in compilation where the program isn't yet complete (because macros haven't run), which means expression evaluation isn't necessarily possible. The existing source is not yet a complete Dart program, and the language only defines semantics for complete and valid Dart programs.

Since macro applications are annotations, and annotations are expressions, we need some way to recognize an expression as representing a macro, and preferably without needing to evaluate another annotation to mark the first annotation class as a macro application.

Here's an attempt to flesh out an idea, based on in-person discussions from the joint meeting week, of defining which declarations are macro application declarations in metadata, completely outside of the source.

Pubspec/package-config-defined macro annotations

We want to recognize specific annotation expressions as macro application triggers.

We do that by singling out a macro application annotation class or a macro application annotation variable, and then considering every annotation that constructs an instance of that class, or references that variable, to be a macro application trigger.
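For illustration, the two trigger forms side by side (a sketch; all names are hypothetical, and assume MyMacro and myMacro are designated in the package configuration):

class MyMacro {
  const MyMacro();
}

const myMacro = MyMacro();

@MyMacro() // instance creation of the designated macro application class
class UserA {}

@myMacro // reference to the designated constant macro application variable
class UserB {}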

Further, we associate a macro implementation with the macro application. There are a number of ways we can do that; I'll mention some and leave the choice open for now.

The metadata specifying which class is a macro application trigger, and which macro implementation it executes, will be available in the .dart_tool/package_config.json file, as a non-breaking addition to the format. (The format for a program that doesn't depend on any macro-defining packages will be the same as today, and having a macro available will add extra information.)

The information in the package configuration is generated from similar information in the declaring package's pubspec.yaml, and written by the Pub tool along with the rest of the .dart_tool/package_config.json file.

Pubspec format

A package which introduces a macro application class can add (something like) the following to its pubspec.yaml:

macros:
  - application:
      library: macros.dart
      name: MyMacro
    implementation:
      library: src/macros/my_macro_impl.dart
      name: MyMacroImpl
  - application:
      local_library: tools/test_macro.dart
      name: MyTest
    implementation:
      local_library: tools/test_macro_impl.dart
      name: testMacro

The library value is a path inside the current package's lib/ directory. That is, the path of a package: URI without the leading package:mypackagename/. This file can be referenced by other packages.

The local_library value is a file path relative to the pubspec.yaml file itself, which must not be inside the lib/ directory. It allows a package to declare a local macro for its own use only, for example a macro only used by its own tests.

The name value is an identifier, which should be the name of a top-level declaration inside that library.
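For example, assuming the declaring package is named my_macro, the paths would resolve roughly like this (a sketch, not normative):

# In my_macro's pubspec.yaml:
# library: macros.dart            -> package:my_macro/macros.dart
# library: src/macros/impl.dart   -> package:my_macro/src/macros/impl.dart
# local_library: tools/test.dart  -> <package root>/tools/test.dart (no package: URI)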

Files inside lib/ must not depend on a local macro. (They can't do so without importing it using a file: URI, so that restriction should come naturally.)

The implementation: (somehow) specifies how to run the implementation code.

Package-config format

This information is written into the package_config.json file for any package which depends on the macro-declaring package.

Local macros are only included for the current package; public macros are included for all packages in the program.

The format for a package: entry is changed to include a macros: [] field:

{
  "name": "my_package",
  "rootUri": ".../pub_cache/hosted/my_package/1.2.3/",
  "packageUri": "lib/",
  "languageVersion": "3.7",
  "macros": [
    {
      "application": {
        "library": "macros.dart",
        "name": "MyMacro"
      },
      "implementation": {
        "library": "src/macros/my_macro_impl.dart",
        "name": "MyMacroImpl"
      }
    }
  ]
}

The library URI is relative to the surrounding package's package: URI root, package:my_package/, which means that both the macro application annotation and the macro implementation entry point must be in the same package. (They can always call code from other packages afterwards.) It's not possible to designate an existing annotation from another package as a macro trigger. No making @override a macro trigger!

A local_library entry can be included for the current package, which has a path relative to the package_config.json file itself. (It can use non-relative file paths, but generally shouldn't. I guess it can technically use a package: URI as the implementation, but it should use library for that.)

{
  "name": "my_package",
  "rootUri": "../",
  "packageUri": "lib/",
  "macros": [
    {
      "application": {
        "local_library": "../tools/test_macro.dart",
        "name": "MyTest"
      },
      "implementation": {
        "local_library": "../tools/test_macro_impl.dart",
        "name": "testMacro"
      }
    }
  ]
}

No entry may have a library path for the application and local_library for the implementation. The other direction is allowed, and is then considered a package local macro.

Tool usage

With this information available, a source processing tool (compiler, analyzer) can recognize a macro application annotation syntactically: it's an annotation referring to the declaration with the given name in the given library. In the examples above, that's the MyMacro declaration of the package:my_package/macros.dart library.

Since the program (and even the current library, if it has a macro application annotation) is not complete, the macro tool needs its own preliminary name resolution, which predicts which declarations, with which names, are exported by which libraries, and which then builds the preliminary import and declaration scopes for each library. Metadata annotations are then resolved against these preliminary declarations. Since the program is incomplete, failing to resolve a name is not an error.

The next step is then recognizing source annotations which are instances of that declaration.

To do that, the compiler (I'll use that term to cover any program which may run macros) can directly recognize a metadata annotation whose constant expression is either an invocation of a constructor of a macro-application class declaration, or a direct reference to a macro-application constant variable declaration. That is, either:

  • @MacroClass(arguments) or @MacroClass.name(arguments), invoking a constructor of a designated macro application class, or
  • @macroVariable, a direct reference to a designated constant macro application variable.

This is very limited, and deliberately so. There is no conditional expression. You can't do const macro = isWeb ? MyMacro.web() : MyMacro.native(); to choose between two different macro implementations; rather, the macro constant itself defines which implementation to use. One can do const macro = MyMacro(isWeb); and handle the conditional behavior inside the macro implementation.
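A sketch of the difference (names hypothetical):

class MyMacro {
  const MyMacro([this.isWeb = false]);
  final bool isWeb;
}

const bool isWeb = bool.fromEnvironment('dart.library.html');

// NOT recognized as a macro application trigger: the initializer is a
// conditional expression, not a direct constructor invocation.
// const macro = isWeb ? MyMacro.web() : MyMacro.native();

// Recognized: a direct constant constructor invocation; conditional
// behavior is handled inside the macro implementation instead.
const macro = MyMacro(isWeb);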

Notice that the "library and name" denoting a macro application declaration are checked against the resolved declaration's original library, not any library it might have been exported through. There has to be a declaration with the specified name in the specified library, not just in its export scope. (A type alias declaration suffices; the actual class can still be declared in another library.) This design is chosen so that import resolution doesn't have to remember the export path(s) that the name is imported through, only where the declaration was originally written.

Author support

This design requires a macro author to update the pubspec.yaml file every time they make a change to the naming or location of a macro application annotation class or a macro implementation. That gets old fast.

I propose that the analysis/language server helps with this, by having a @macro annotation added to dart:core. Then an author can annotate the macro application class as:

@macro("src/macro/impl.dart", "MyMacro")
class MyMacro {
  // ...
}

and get a diagnostic, with a quick-fix, if that doesn't match a macro listing in pubspec.yaml.

That is, the way to write a macro is to write:

@macro("src/macro/impl.dart", "MyMacroImpl") // Path relative to current file.
class MyMacro implements SomeMacroInterfaces {
  const MyMacro();
}

and then have a quick-fix insert the corresponding lines into pubspec.yaml and update package_config.json, and have another quick-fix create a missing src/macro/impl.dart if it doesn't exist and add a class MyMacroImpl implements ProperMacroInterfaces {} to it.
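The lines such a quick-fix inserts into pubspec.yaml would presumably look something like this (assuming the annotated class lives in lib/macros.dart):

macros:
  - application:
      library: macros.dart
      name: MyMacro
    implementation:
      library: src/macro/impl.dart
      name: MyMacroImpl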

Probably even allow the implementation to have a back-reference, with the creation quick-fix adding it automatically:

@macro.impl("/macro.dart", "MyMacro")

so that if just one of the @macro annotations gets out of sync with the declarations, it can be auto-fixed to the new name and location, based on following the back-link.

Macro implementation execution

The macro implementation is run as a separate Dart program, in a context where it has access to the "macro execution context" that triggers the macro execution, and which allows introspection and code generation. That context is provided as an argument in each step where the macro is executed.

Macro implementation code is completely normal Dart code, with no new features. It may even contain macro applications itself.

Currently the macro implementation is defined by a class declaration, which means that it's implicitly constructed by the macro execution context (it must have an unnamed constructor that can be called with zero arguments). That can still work. There is some set-up that happens before macro execution starts, so it's not unreasonable to allow that set-up to be generated specifically for the macro class, so that it can be instantiated. (Or use dart:mirrors. Or not!)
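A minimal sketch of such an implementation class (the macro interfaces are elided, since they aren't pinned down here):

// Hypothetical macro implementation class; the interfaces it should
// implement would come from the macro API.
class MyMacroImpl /* implements ...macro interfaces... */ {
  // Unnamed constructor callable with zero arguments, so the macro
  // execution context can instantiate it without generated glue code.
  const MyMacroImpl();
}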

Other options are, for example, designating a constant variable that already contains the implementation value, or a top-level function that serves as the entry point.

All of these approaches, including class-with-zero-argument-constructor, have the restriction that a macro implementation cannot do asynchronous initialization. The solution to that is to delay that initialization until the start of phase 1. Probably not a problem in practice.
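Deferring the asynchronous set-up could look roughly like this (a sketch; the phase 1 entry point name and signature are hypothetical):

class MyMacroImpl {
  const MyMacroImpl();

  // One-time asynchronous set-up, deferred until first use.
  static Future<void>? _setUp;

  static Future<void> _ensureSetUp() => _setUp ??= _doSetUp();

  static Future<void> _doSetUp() async {
    // ... load resources, read configuration, etc. ...
  }

  // Hypothetical phase 1 entry point: await the set-up before doing work.
  Future<void> runPhase1(Object context) async {
    await _ensureSetUp();
    // ... introspect and generate code via the context ...
  }
}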

I suggest using the same restriction as for macro annotations: The name should be the name of a type with a constant unnamed zero-argument constructor, or a constant variable already containing a value.

Annotation Introspection

When the macro implementation runs, it can introspect on the macro application annotation. If that annotation refers to a variable, then the library and name of that variable may be all the available information. That should be fine for macros that are not parameterized.

When introspecting on a constant constructor invocation, the macro may be able to get some information about type arguments and arguments, at least their syntax. For type arguments, their name may be enough to refer to the same type again. For argument expressions, some expression structures may be possible to reify as values in the macro implementation (for example something JSON-like, with only list, map and simple value literals). Other arguments may be completely opaque, at least until later phases of macro execution.
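For example, an application like the following might have its argument reified, since it uses only map, string and integer literals, while an argument referencing a constant from another library could stay opaque (names hypothetical):

class MyMacro {
  const MyMacro({this.options});
  final Map<String, Object?>? options;
}

@MyMacro(options: {'mode': 'web', 'retries': 3})
class Service {}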

Nothing is finalized about this yet.

Post-macro-execution

The macro application annotation is not removed by macro execution, it must still (or finally) be valid as a constant expression in the completed program.

If macro-generated declarations change the preliminary binding of a name (either one denoting a macro application declaration, or a name occurring in the arguments to a macro application constructor), then the macro subsystem decides whether that should be considered an error. The language doesn't consider the validity of partial programs; it has no opinion until the program has been completed by running all macros.

That is: macros do not add any new language features. The feature relies on augmentations, but that is a separate feature. The macro feature is a tool feature, a Dart program pre-processor that runs on something that may not be a valid or complete Dart program, and therefore not something the Dart language has any opinion on. The macro preprocessor runs completely normal Dart code as the macro implementation, and only when that's done and the program is expected to be complete does the Dart language have any say, at which point the result should be a completely normal Dart program.

johnniwinther commented 4 months ago

We could be even more lax about what names can trigger macro application and allow any identifier exported from the library to trigger macro application. This would support application through const constructor invocation, like @Macro(), and constant references, like @macro where const macro = ....

lrhn commented 4 months ago

@johnniwinther SGTM, will rephrase. And drop the recursive checking through aliases and unmarked constants. We're working with Dart code at a point before constant values can exist. It's OK to be a little pedantic about what names (might) mean.

rrousselGit commented 4 months ago

Editing the pubspec doesn't seem to be a requirement using the current dev branch. Would that be a new requirement?

In that case, would not relying on @macro for macros be a solution? If we were to use a separate syntax, such as #macro, would this remove the need for listing macros in the pubspec?

lrhn commented 4 months ago

If this approach is chosen, the tools (analysis/language server) should update the pubspec.yaml for you, if at all possible.

All you would have to do is write, e.g., in lib/src/macro.dart:

@macro('/src/macros/my_macro.dart', 'MyMacroImpl')
const myMacro = _MyMacroTrigger();

then tools should ask you if you want to update the pubspec, and it'll automatically insert

macros:
  - application:
      library: src/macro.dart
      name: myMacro
    implementation:
      library: src/macros/my_macro.dart
      name: MyMacroImpl

which will probably trigger pub get to update .dart_tool/package_config.json.

Next thing it'll tell you that there is no src/macros/my_macro.dart with a MyMacroImpl class, and ask if it should create it.

That's obviously me being highly optimistic about the work of someone else. (Hi Analyzer people! You're doing great!)

Using a syntax like #macro (how?) seems like it would need a language change, which isn't needed otherwise. (I prefer that macros not introduce any changes to the language which aren't usable without macros too.)

davidmorgan commented 4 months ago

Thanks Lasse! I filed #3847 for the annotation introspection issue since it's quite a deep one :)

Generally, looks good to me :)

My one suggestion would be maybe to generalize the first part away from macros: a mechanism to write structured information for the build tools in pubspec.yaml, to be passed through package_config.json, seems like it might be useful for other things. The mapping from @macro to pubspec.yaml enforced by a lint + quick fix could also apply more generally. Maybe some word like pragma, precompile, define, ...

With respect to macros it's possible we end up wanting to add to the structured data e.g. to pass hints related to optimization to the build tools. So even if it's just for macros a bit of leeway for additions would be nice.

Thanks :)

davidmorgan commented 4 months ago

@lrhn Another note related to "we might want to store more data this way", I believe the macro metadata proposal from @johnniwinther #3847 might include wanting to specify annotations that are not macros but are "more introspectable" by macros.

It would essentially mean that modular compiles store enough information to reconstruct the annotation as source, in addition to as a const value.

Possibly this is a fit for the same mechanism.

lrhn commented 4 months ago

Storing more data starts to worry me. If we expect larger amounts of data, not just "~1 or less" per package, it might be better to have a separate file, and a general feature for producing metadata accessible to tools from pubspec.yaml entries of all the available packages.

Something like

tool_data:
  macros:  # key to add entries to
    ...    # arbitrary JSON-like data values
dev_tool_data:  # only for the same package
  macros: ...

All the data from all transitive dependencies, plus dev-data from the current package, is collected, put into lists by key (expanded if they are already lists), and written to individual files per key, like .dart_tool/tool_data/macros.json; then any tool can read what it needs. Maybe each file is sectioned by originating package. That's probably safer, since paths in the data can then be package relative.
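A sketch of what a per-key output file, sectioned by originating package, might look like (the shape is hypothetical):

{
  "my_macro_package": [
    {"application": {"library": "macros.dart", "name": "MyMacro"}}
  ],
  "other_package": [
    {"some": "other tool data"}
  ]
}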

(Pub will have to decide when to GC that directory, which could be every time it regenerates it.)

davidmorgan commented 4 months ago

Thanks.

"A lot of data" can mean very different things depending on context ;)

To see if it's needed to split anything out, maybe we can ask: are there any tools that want to consume package_config.json but do not want this data?

For most of the cases we've talked about, the analyzer and CFE definitely want the data, it's needed to understand/compile the code. Hints related to performance would be a grey area, they are needed to understand/compile the code quickly :)

Pub itself will care about the data if it wants to fetch/prebuild macros.

So while I agree that introducing a bucket of data for people to throw things into is a dangerous thing, I suspect a sign saying "only for data we always want when compiling" may be correct, and a reasonable idea if we think we can stick to it :)

A view from the point of view of build performance: the google3 build has to aggregate and merge package JSON files across the build graph, the complexity and build time cost is non-zero but does not show as a big cost in the final analysis. Anything that significantly (2x? 5x?) increases the size of package_config.json would be a concern, but probably workable. I guess we are looking at <2x, largely because the number of "special" packages should be small.

modulovalue commented 4 months ago

Just a quick question: Is this proposed @macro-annotation-machinery only relevant to macro authors, or will consumers of a macro also have to use the @macro-annotation-machinery to declare that a particular declaration is meant to be eligible for macro expansion?

Since @davidmorgan has closed https://github.com/dart-lang/language/issues/3604, it looks like the answer to the second question is yes, correct? So this proposal states that Dart's annotations (except for @pragma and @macro) will no longer be expected to modify the semantics of a given program in any way (outside of any external third party tooling)?

davidmorgan commented 4 months ago

@modulovalue It's for macro authors only; the answer to the second question is "no".

Your own package's package config aggregates the package config of all your dependencies; so if the macro author makes an annotation special in their package config, it's special in your package config too.

lrhn commented 4 months ago

might include wanting to specify annotations that are not macros

This was what worried me. If we're generalizing, we may want to worry about how far we'll be generalizing, not just the immediate plans, because such things tend to escalate. (Until package_config.json interpreters implement all of Common Lisp.)

If we have a good hunch that this is a one-off, and later features are not going to want their own data in package_config.json, then it'll probably be fine. It'll be a little larger, and the tools that need the configuration are also the ones reading package_config.json to begin with.

But if we can just as well put the macro information into a separate macro_config.json file, maybe we should just do that. Just to be safe.

davidmorgan commented 4 months ago

There is some cost to adding an additional file, so I would err on the side of not adding a new file; we could "always" split one out later ;)

jakemac53 commented 3 months ago

I want to mention one potential pitfall of moving to yaml over Dart files - you might now be paying an extra cost for any macros provided by any package you depend on, even if you don't use them. Previously, you definitely only paid for them if they were imported. We should be able to work around that with good tooling, but it is worth keeping in mind.

davidmorgan commented 3 months ago

That seems okay: macros should usually be either 1) used in the package where they are defined, or 2) part of a package you depend on exactly if you want the macro. That still leads to more cost if a package provides multiple macros--but the cost of compiling multiple macros in one package should not be much different from the cost of compiling whichever subset you happen to use. And anyway this can be better for some cases: e.g. it will be quicker if you then start using one of the others :)

jakemac53 commented 3 months ago

Re-visiting this, it probably is worth bike shedding a bit on the specifics.

library vs local_library

Specifically for me, the library versus local_library distinction is odd and we should remove it.

I would suggest that we just accept a relative path, from the root of the package, for all of these. There is no explicit need to make the user think about the difference between them.

So, the original example would become:

macros:
  - application:
      library: lib/macros.dart
      name: MyMacro
    implementation:
      library: lib/src/macros/my_macro_impl.dart
      name: MyMacroImpl
  - application:
      library: tools/test_macro.dart
      name: MyTest
    implementation:
      library: tools/test_macro_impl.dart
      name: testMacro

We should still have a restriction that if the annotation library is under the packageUri, the implementation library URI must also be.

Rename application key to annotation

Self-explanatory: I think we should call it annotation (it is the metadata annotation class that will trigger a macro application, but users just think of it as an annotation).

Should we allow general URIs?

We could allow any URI for the library key - so you could apply a macro from a different package as an example, using your own annotation.

This might be a bad idea though - if specified too loosely it could allow a package to attach its macro to any arbitrary annotation, which would be bad for modular builds and in general allow monkey patching of just about anything.

But, it also feels a bit weird to require the annotation and macro to be developed in the same package. Probably a good place to start though.

Name referring to top level getters/fields

The code bootstrapping a macro can't introspect, so it doesn't know if the name refers to a top level field/getter or macro class.

I would be inclined to at least initially just go with this suggestion:

I suggest using the same restriction as for macro annotations: The name should be the name of a type with a constant unnamed zero-argument constructor, or a constant variable already containing a value.

Fwiw, this isn't actually how the current semantics work; there can be arguments. But we should not allow arguments in this case.

lrhn commented 3 months ago

We should still have a restriction that if the annotation library is under the packageUri

To nit-pick, packageUri is a package_config.json concept, and this is pubspec.yaml, so it's just "under lib/".

That should then work. If the annotation is under lib/, the implementation must be too. If the annotation is not under lib/, the macro is not available in other packages at all (won't be written to the package config), and cannot be used under lib/ of the same package.

The paths written to package_config.json can be the same ones, relative to rootUri, and the tools convert paths under lib/ to package URIs to recognize references.

The library URI plus name should still denote an entry in the export scope of the library, not necessarily the original location of the declaration. That can make it a little harder for tools, which have to remember which library exported the annotation, but they should probably mark the name while resolving imports, not try to figure out the source afterwards, when a name can be imported through multiple imports.
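For example, under this rule the following set-up would work (a sketch):

// lib/src/annotations.dart declares the class:
class MyMacro {
  const MyMacro();
}

// lib/macros.dart only re-exports it:
// export 'src/annotations.dart' show MyMacro;

// An entry with library: lib/macros.dart and name: MyMacro then matches,
// because MyMacro is in the export scope of lib/macros.dart, even though
// the declaration lives in lib/src/annotations.dart.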

That also means that you can technically attach a macro to @override, as long as you can get someone to import it through your library's re-export. We could require that the annotation must be declared in the same package as the pubspec.yaml that makes it a macro application.

davidmorgan commented 3 months ago

Hmmm apparently I somehow missed the "must be in the same package" part on first reading.

I think it's important to leave room for the implementation to be in another package: it buys a huge amount of flexibility for the whole program version solve if it's only the annotations that are forced to be in the version solve, not the implementations.

I know we don't plan to use that for now, but I do think it's worth keeping the option open. It's complexity we already have today for generators and their annotations, and while it's a bit annoying, it's always felt like necessary complexity.

jakemac53 commented 3 months ago

The only real requirement is that if the annotation is public, the macro implementation must also be public. They could come from different packages though and imo that would be fine.

I do not think it is likely we will ever get to a world where macros have a different version solve though, and don't think we should consider that possibility when designing the feature tbh:

  • Macros are tightly coupled to their annotations, we cannot ignore the macro constraints on those annotation packages when considering the users version solve.

  • Macros also are going to be tightly coupled with other helper packages, which need to match up with the users constraints on those same packages.

If we want to consider a special kind of dependency instead, which is private to the package, we could use that for all macro implementation deps. So for instance, the macros dependency in a macro package could be a "private" dependency which would allow it to be different from the version that other macros get. However, this would make macros less composable, you couldn't as a macro import another macro and directly invoke it, because your types won't match up.

davidmorgan commented 3 months ago

The only real requirement is that if the annotation is public, the macro implementation must also be public. They could come from different packages though and imo that would be fine.

"Fine" sounds good to me ;)

I think it does introduce complexity, though: if the implementation package is not a dep of the annotation and not a dep of the package using the annotation, how does the version solve work? Maybe its version is "any" and we rely on a version constraint from the implementation package onto the annotation package to pick a good version. Or, maybe you can specify a version when you refer to it in the pubspec.yaml?

I do not think it is likely we will ever get to a world where macros have a different version solve though, and don't think we should consider that possibility when designing the feature tbh:

I think it's a possibility we should allow for; you can read about the problems Swift has because all macros must compile with one version of SwiftSyntax. Note that it's a problem even though Swift offers #if canImport, which we don't have an equivalent for.

  • Macros are tightly coupled to their annotations, we cannot ignore the macro constraints on those annotation packages when considering the users version solve.

I think that's fine? Macros can evolve their annotations carefully, and give careful version constraints implementation->annotation, which we need to respect when fetching the implementation. Because the macros access the annotation via introspection they can be compatible with a bit of effort over changes you might think would be breaking, like built_value is today.

  • Macros also are going to be tightly coupled with other helper packages, which need to match up with the users constraints on those same packages.

Do you mean that the macro will generate code that uses a helper package, and maybe create imports onto it?

I can see that it would make more sense to specify the version constraint onto that package from the implementation package.

That said I can't actually think of examples of such a helper package today--isn't it always that the helpers go next to the annotation?

If we want to consider a special kind of dependency instead, which is private to the package, we could use that for all macro implementation deps. So for instance, the macros dependency in a macro package could be a "private" dependency which would allow it to be different from the version that other macros get. However, this would make macros less composable, you couldn't as a macro import another macro and directly invoke it, because your types won't match up.

Yes, it's not clear what we could want here ... for me supporting splitting to a different package seems like it gives us equivalent freedom to introducing a new type of dep, without actually having to figure out what a new kind of dep should be/do.

Thanks.

jakemac53 commented 3 months ago

I think it does introduce complexity, though: if the implementation package is not a dep of the annotation and not a dep of the package using the annotation, how does the version solve work?

The macro package should always have a dependency on any annotations it will be reading. They 100% depend on the API of those classes and should express that dependency in their pubspec, even if they aren't literally importing the annotations packages.

Also, any package which has macro configuration like this should have a dependency on any macros/annotations it is setting up configuration for, to ensure they are present.

I think it's a possibility we should allow for; you can read about the problems Swift has because all macros must compile with one version of SwiftSyntax. Note that it's a problem even though Swift offers #if canImport, which we don't have an equivalent for.

We can avoid this problem in other ways though (for example shipping a protocol not an API, or "private" dependencies). This is also why I don't love the idea of shipping a complex AST structure that represents all possible const expressions.

I think that's fine? Macros can evolve their annotations carefully, and give careful version constraints implementation->annotation, which we need to respect when fetching the implementation. Because the macros access the annotation via introspection they can be compatible with a bit of effort over changes you might think would be breaking, like built_value is today.

I disagree that this is a reasonable/scalable approach. Macro authors should be able to version their annotations and macros together, without maintaining backwards compatibility. We have the tools available to do this and should use them. We also don't expose package versions to authors today.

Do you mean that the macro will generate code that uses a helper package, and maybe create imports onto it?

Right, a macro might want to generate code interacting with any API from any package. That means they will need to be able to impose version constraints on the users version solves.

That said I can't actually think of examples of such a helper package today--isn't it always that the helpers go next to the annotation?

Often, yes they would live in the annotation package, but that doesn't make a difference. Either way, the macro needs to impose a version constraint on the users version solve (in that case it's the annotation package, which they anyways should be constraining).

Consider a functional widget macro as another example, they will be generating an entire widget class, and if flutter used semver they would want to impose some constraints on the flutter package to ensure they were generating valid widgets.

davidmorgan commented 3 months ago

I think that's fine? Macros can evolve their annotations carefully, and give careful version constraints implementation->annotation, which we need to respect when fetching the implementation. Because the macros access the annotation via introspection they can be compatible with a bit of effort over changes you might think would be breaking, like built_value is today.

I disagree that this is a reasonable/scalable approach. Macro authors should be able to version their annotations and macros together, without maintaining backwards compatibility. We have the tools available to do this and should use them. We also don't expose package versions to authors today.

Yes, the implementation should be able to specify a range of versions--just like I can with built_value_generator onto built_value.

But I chose to take this approach because I wanted to offer a more flexible version solve to users.

It's not free for a macro author when they constrain their version solve--people will complain. So there is an incentive to support more versions :) ... and we should make that possible if we can.

Do you mean that the macro will generate code that uses a helper package, and maybe create imports onto it?

Right, a macro might want to generate code interacting with any API from any package. That means they will need to be able to impose version constraints on the users version solves.

That said I can't actually think of examples of such a helper package today--isn't it always that the helpers go next to the annotation?

Often, yes they would live in the annotation package, but that doesn't make a difference. Either way, the macro needs to impose a version constraint on the users version solve (in that case it's the annotation package, which they anyways should be constraining).

Consider a functional widget macro as another example, they will be generating an entire widget class, and if flutter used semver they would want to impose some constraints on the flutter package to ensure they were generating valid widgets.

I'm not sure that's an important use case; I can't think of any examples. Maybe it is.

Anyway, I think we don't lose anything by splitting the implementation out, because you can apply a version constraint from the annotation package--which otherwise would be the annotation+implementation package--so the capability to constrain looks equivalent?

I think you said upthread that splitting implementation to a separate package is fine. If it's fine but we disagree on how important we think it will be, maybe we can still make progress ;) as long as the downsides you see can be addressed?

Thanks :)

jakemac53 commented 3 months ago

Yes, the implementation should be able to specify a range of versions--just like I can with built_value_generator onto built_value.

This is my point though, if macros have their own version solve, this isn't possible.

Anyway, I think we don't lose anything by splitting the implementation out, because you can version constraint from the annotation package--which otherwise would be the annotation+implementation package, so the capability to constrain looks equivalent?

Ok I think maybe it would be helpful to get a bit more concrete, I think I see what you are getting at, so here is a proposal based on what I think you are saying:

  • In the annotation package, add the macro config to the pubspec. This can point to a macro implementation in a different package.

  • Also in the annotation package, add dependencies and constraints for any code the macro will use.

  • Add a new dependency section, macro_dependencies.

    • List the macro package constraints here. Any macro implementation listed in the macro config should be listed, but not anything else.

  • Macro dependencies get their own version solve, with these rules:

    • All their own regular dependency constraints must be met.

    • All the users dependencies and dev_dependencies constraints must be met, including transitive regular dependencies.

    • Importantly, the macro_dependencies are not included in this solve, except for the constraints on the exact macro being solved for.

  • Users just depend on the annotation package.

Does that make sense?

Basically, it would allow for different macros to have different package:macros versions, or other macro related dependencies. And their dependencies don't have to have a compatible version solve with other macros - only the regular runtime dependencies of the users app.
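In pubspec terms, the annotation package might then look something like this (the macro_dependencies section and all names are hypothetical):

# foo_macro/pubspec.yaml (the annotation package):
name: foo_macro
dependencies:
  collection: ^1.18.0     # constraints for code the generated output will use

macro_dependencies:
  foo_macro_impl: ^1.3.0  # the implementation package, given its own version solve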

davidmorgan commented 3 months ago

Ok I think maybe it would be helpful to get a bit more concrete, I think I see what you are getting at, so here is a proposal based on what I think you are saying:

  • In the annotation package, add the macro config to the pubspec. This can point to a macro implementation in a different package.

Right.

  • Also in the annotation package, add dependencies and constraints for any code the macro will use.

Yes. It would be a bit nice to do something smarter with these, they're a special type of dependency that only becomes a dependency if something specific gets generated. But we have never figured out a good way to do that, so let's not worry about it :)

  • Add a new dependency section, macro_dependencies.

    • List the macro package constraints here. Any macro implementation listed in the macro config should be listed, but not anything else.

Conceptually yes, but I'm not sure it needs to be explicit. Can we just pull it from the macro config? That might include putting a version in the macro config if we think that's useful.

  • Macro dependencies get their own version solve, with these rules:

I would punt on this for now--just treat the implied or explicit macro_dependencies as normal dependencies and do one version solve as we do today. Then we only get into the details of the next bit with a clear motivating case :) ... with that in mind, my tentative thoughts on the rest:

* All their own regular dependency constraints must be met.

Right.

* All the users dependencies and dev_dependencies constraints must be met, including transitive regular dependencies.

Trying to reason through what this actually does ;)

I think the effect is that the version range of the annotation package compatible with the user deps (-macro deps) is determined, and then this version range is used to pick the macro implementation package version to use--or to fail if there isn't one.

Because it's a range, I think it could still be the case that a macro implementation package version corresponding to a different annotation version than the one actually used could be picked? So possibly it should be stricter.

If the macro implementation package happens to "privately" use some dep shared with the user code, i.e. it's used during generation but not in the generated code, then I think there's no benefit to solving together. So possibly less strict for that case.

Maybe we can get both like this: version solve the user program - macro deps to get the exact version of the annotation package that will be used. Then version solve the macro implementation with the annotation package pinned to that exact version. The macro implementation promises to work based on the annotation package deps at some specified version being available--so give it the exact one and I think we're done? :)

* Importantly, the macro_dependencies are not included in this solve, except for the constraints on the exact macro being solved for.

Yes indeed.

  • Users just depend on the annotation package.

Right.

Does that make sense?

Basically, it would allow for different macros to have different package:macros versions, or other macro related dependencies. And their dependencies don't have to have a compatible version solve with other macros - only the regular runtime dependencies of the users app.

SGTM.

Thanks :)

jakemac53 commented 3 months ago

Conceptually yes, but I'm not sure it needs to be explicit. Can we just pull it from the macro config? That might include putting a version in the macro config if we think that's useful.

I do think it should be explicit. We have a way of expressing dependencies/constraints already; let's stick with it for consistency.

I would punt on this for now--just treat the implied or explicit macro_dependencies as normal dependencies and do one version solve as we do today. Then we only get into the details of the next bit with a clear motivating case :) ... with that in mind, my tentative thoughts on the rest:

I agree - and I wouldn't even add macro_dependencies today. If/when we decide to try and utilize this kind of dependency, we can ask macro authors to move their macro dependency into that section, and it should also work if they don't (it will just have a harder time getting a version solve). So, basically we can release an optimization feature later where they can move macro deps into a different section for easier version solves.

Maybe we can get both like this: version solve the user program - macro deps to get the exact version of the annotation package that will be used. Then version solve the macro implementation with the annotation package pinned to that exact version. The macro implementation promises to work based on the annotation package deps at some specified version being available--so give it the exact one and I think we're done? :)

This assumes macros are only coupled to the annotation package version, which isn't necessarily always the case. So, I think all the user-land constraints (minus macro-deps and their transitive deps) need to be taken into account.

Either that, or we require macro packages to distinguish between their kinds of deps - the ones they depend on for generation and the ones they depend on in generated code - but then we have to add a whole additional section.

I think the effect is that the version range of the annotation package compatible with the user deps (-macro deps) is determined, and then this version range is used to pick the macro implementation package version to use--or to fail if there isn't one.

I do worry about how much complexity this would add to version solves in general. These separate version solves end up having to interact with each other... it sounds complicated.

lrhn commented 3 months ago

If macro dependencies are just normal dependencies, it's not a restriction that the annotation and implementation must be exported by libraries in the same package. The implementation can be exported from another package.

It did sound like what we need is more of a binary dependency on the implementation. We don't need the source (much), and it doesn't have to be compiled with the same dependencies as the current package that it generates code for. It's more like the macro could be a bin/ executable that was pub global install'ed and run with its own package resolution.

We could make it be that: An executable that takes a special first argument that tells it how to connect back to the compiler's macro service, which it must do using package:macro. There must still be a version constraint from the macro annotation package to the implementation binary, but it's a "binary" version constraint, it doesn't include all the transitive dependencies into the runtime dependencies of the current package.

Maybe a dependency kind like that could be useful for hook scripts too, which include code from other packages while building, but doesn't need that code in the built program. Not the same, though.

Maybe:

  • a binary dependency is a dependency for client packages, but does not add its own dependencies as dependencies of the client, and does not provide its own content as package: accessible code. It can be resolved and compiled stand-alone, as long as it satisfies the version requirements of all the dependencies on it.

How to invoke a binary dependency is up to whoever needs to do it, and whether to use the same resolution as the depending package for packages that both packages depend on is a Pub choice, as long as a solution is found. What can be assumed is that there is a version of the package available satisfying the version constraint on that package, with a complete package resolution. The rest is just working on a normal Pub repository, or something optimized to be compatible with that.

davidmorgan commented 3 months ago

I would punt on this for now--just treat the implied or explicit macro_dependencies as normal dependencies and do one version solve as we do today. Then we only get into the details of the next bit with a clear motivating case :) ... with that in mind, my tentative thoughts on the rest:

I agree - and I wouldn't even add macro_dependencies today. If/when we decide to try and utilize this kind of dependency, we can ask macro authors to move their macro dependency into that section, and it should also work if they don't (it will just have a harder time getting a version solve). So, basically we can release an optimization feature later where they can move macro deps into a different section for easier version solves.

SGTM

Maybe we can get both like this: version solve the user program - macro deps to get the exact version of the annotation package that will be used. Then version solve the macro implementation with the annotation package pinned to that exact version. The macro implementation promises to work based on the annotation package deps at some specified version being available--so give it the exact one and I think we're done? :)

This assumes macros only are coupled to the annotation package version, which isn't necessarily always the case. So, I think all the user-land constraints (minus macro-deps and their transitive deps) need to be taken into account.

Either that, or we require macro packages to distinguish between their kinds of deps - the ones they depend on for generation and the ones they depend on in generated code - but then we have to add a whole additional section.

So what I was imagining was that, if you choose to split the implementation to a different package, the deps for generation go on the implementation package and the deps for generated code go on the annotation package.

If there is some generation-time dependency that all macros want to fix to the same version then all macros would have to add that dependency to their annotation packages, if they split to separate annotation/implementation.

In that way I think the macros are only coupled to the annotation package version? That's the intent.

Note that if all macros want the same version of a generation-time dependency then you don't get that by pulling in the macro user deps, because there is no reason for the macro user to depend on that generation-time dependency. It has to be that the macros say they want it in the shared version solve, which I envisage doing by having it be a dep of the annotation.

davidmorgan commented 3 months ago
  • a binary dependency is a dependency for client packages, but does not add its own dependencies as dependencies of the client, and does not provide its own content as package: accessible code. It can be resolved and compiled stand-alone, as long as it satisfies the version requirements of all the dependencies on it.

It might be interesting to refine this idea, indeed.

The natural dependency direction for macro annotation and impls is impl onto annotation, not annotation onto impl.

The annotation doesn't say "I am implemented by package:foo_macro_impl versions 1.3-1.7", because it doesn't know: it's the impl package versions that know which range of annotations they are compatible with, "I implement package:foo_macro versions 2.2-2.4". You can later release an implementation that supports a wider annotation version range, including earlier versions, if you like.
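In pubspec terms, the natural constraint direction would then be something like (package names hypothetical):

# foo_macro_impl/pubspec.yaml: the implementation constrains the annotation,
# "I implement package:foo_macro versions 2.2-2.4":
name: foo_macro_impl
dependencies:
  foo_macro: '>=2.2.0 <2.5.0'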

So I'm not sure if the annotation package actually needs a version onto the impl package, or if it's just always any. I did once add a dep built_value->built_value_generator because I screwed up a release and thought this was a clever fix, but there was some fallout (don't remember quite what) so apparently it was not so clever as I thought :)

davidmorgan commented 2 weeks ago

A minor question came up in discussion with Jake today: do we allow multiple macro implementations to be associated with one annotation; and if so, what order do they run in.

At first thought, "yes" and "(if an ordering is needed) in the order they are mentioned in the yaml file / package config" seem reasonable; what do others think please? :)

lrhn commented 2 weeks ago

I would say no to multiple macros associated with one annotation.

It's not good style to associate macros with an annotation that you didn't create yourself. That's adding semantics to something that already has its own. If it's your own annotations that you are adding multiple macro implementations to, you should just combine those into one macro.

(But you can associate the same macro implementation with multiple annotations, since the implementation gets to inspect the annotation.)

davidmorgan commented 2 weeks ago

Good point, thanks; it's anyway a goal that macros are easy to compose in code, so there is perhaps no use case left for combining them declaratively.

Wdestroier commented 2 weeks ago

It's not good style to associate macros with an annotation that you didn't create yourself.

I want to give an example. Let's say we have a class annotated as an API controller.

@RestController('/v1/user')
class UserController {}

Assuming a fictional package generates code to register endpoints for this class, couldn't another package check the same annotation to register endpoints for an OpenAPI specification?

lrhn commented 2 weeks ago

It's definitely technically possible to have multiple macros triggering off the same macro annotation. I don't think it's desirable to trigger off other people's annotations. At all.

I really do not want someone to add a macro to @Deprecated() or, heck, @override. This is not monkey-patching; you don't get to apply macros to code that didn't ask to have macros applied. Macro annotations should come with an intent, and the user who applies a macro should understand what they get, which is what the package exposing the macro says it means. No more and no less.

If the second fictional package also wants to introduce a Rest API, it should export its own RestController annotation, so that the user code can choose which of two syntactically compatible macro annotations it wants to import and use. If the code really wants to use both, it can import with prefixes and write @api1.RestController('/v1/user') @open_api.RestController('/v1/user') class UserController {}.
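Spelled out, that last option would look like this (package URIs hypothetical):

import 'package:api1/rest.dart' as api1;
import 'package:open_api/rest.dart' as open_api;

@api1.RestController('/v1/user')
@open_api.RestController('/v1/user')
class UserController {}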

davidmorgan commented 2 weeks ago

Agreed.

If you like you can write a macro @AlsoGenerateOpenApiEndpoints() that you tag the library with that reads all @RestController in the library. That seems a pretty good compromise between control and brevity.
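As a sketch, assuming the fictional OpenAPI package exports such a library-level macro annotation:

// One library-level macro application that scans the library for all
// @RestController classes and generates the OpenAPI endpoint bindings.
@AlsoGenerateOpenApiEndpoints()
library;

import 'package:api1/rest.dart'; // hypothetical

@RestController('/v1/user')
class UserController {}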

Wdestroier commented 2 weeks ago

you tag the library

Cool, thanks @davidmorgan and @lrhn.

jakemac53 commented 2 weeks ago

It seems to me that maybe annotations which trigger macros should have to be themselves annotated with something pointing at the macro then, and not the other way around (I don't actually recall which approach we were heading towards, and maybe this is already how it will work).

That would allow applying multiple macros with an obvious order, the order that the annotation lists them.