dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.11k stars 1.57k forks source link

Macros: Static types are broken #55880

Open simolus3 opened 4 months ago

simolus3 commented 4 months ago

I'm happy to work on fixing this myself, I just wanted to get some design feedback first. cc @jakemac53

To represent static types to macros, the following classes are used:

That's already a lot of classes! I was working on expanding static types to include the rest of the type hierarchy, which requires distinct interfaces for nullable types, void, dynamic, Never as well as function (bonus: type parameter types) and record types. With the current design, that requires 35 additional type representations!

I've also realized that with the current way types are implemented in the CFE and the analyzer, StaticType subtypes all get erased into a generic StaticTypeImpl instance after serialization. The reason is that named types are implemented as class _InterfaceTypeImpl extends _StaticTypeImpl implements macro.NamedStaticTypeImpl. However, the class should really extend NamedStaticTypeImpl, otherwise the correct serializeUncached / get kind methods aren't overridden. It can't extend both the common class and NamedStaticTypeImpl of course, and mixins are awkward to use here too because both classes have multiple fields. This affects both analyzer and CFE and results in NamedStaticTypes not ever showing up in macros at the moment.

A possible solution that avoids some of the duplicate logic in analyzer/CFE would be to separate data and behavior in type implementations: They could have a single type implementation that transforms their internal representation into a macro representation but doesn't require multiple classes. So something like this in the end (for the analyzer):

class _StaticTypeImpl extends macro.UniformStaticTypeImpl {
  final DartType type;

  _StaticTypeImpl(super.id, this.type);

  @override
  macro.StaticTypeImpl get representation {
    return switch (type) {
      DynamicType() => macro.DynamicTypeImpl(id),
      InterfaceType() => macro.NamedStaticTypeImpl(id, declaration: ..., typeArguments: [...]),
      ...
    }
  }

  Future<_StaticTypeImpl> asInstanceOf(...) {...}
  // other methods
}

UniformStaticTypeImpl would then delegate serializeUncached and get kind to the actual representation while also keeping its own id, so that calls to StaticType methods like asInstanceOf and isSubtypeOf would be delegated to the right _StaticTypeImpl class. That way, we have a single type implementation in CFE/analyzer that actually gets mapped to the correct interface when serialized to macros. This is also closer to what we're doing for other kinds of declarations (which are mapped instead of re-implemented by the macro embedders), except that types also have methods on their own.

jakemac53 commented 4 months ago

which requires distinct interfaces for nullable types, void, dynamic, Never as well as function (bonus: type parameter types) and record types.

I am not sure that we need all of these, we do need one for function and record types. But void/dynamic/Never could probably all just be regular StaticTypes? They don't have extra information that we want to be able to extract from them, afaik.

They could have a single type implementation that transforms their internal representation into a macro representation but doesn't require multiple classes. So something like this in the end (for the analyzer):

That idea seems like it generally makes sense to me, to simplify the analyzer/cfe classes.

Alternative: client side static types

Another option here is implementing all of the static type logic client side, which would avoid the async calls, and most of the classes. The cost of course being that we have to re-implement all that logic, which can get complicated, especially with generics etc. But for macros that want to do lots of type checks, it would be a lot more efficient. And it would be pure client side logic, static types wouldn't even exist on the analyzer/cfe side of things. Future changes to this logic wouldn't need analyzer/cfe changes, so there would be fewer packages to publish and no issue with flutter engine->framework rolls.

@davidmorgan is also looking into more of a pure data model approach, which doesn't fit well with the current static type system, and would probably require a client side implementation.

Side note: We do want to avoid too much churn in these APIs right now. Each time we change the protocol there is a decent bit of work involved, 3 packages to publish, and manual work for the flutter engine -> flutter framework roll. We haven't really figured out a good process yet.

simolus3 commented 4 months ago

Having client-side static types that are just descriptions sounds exciting and I'd love to give it a go if this is not being worked on yet. Another benefit is that we can go from TypeAnnotation to StaticType synchronously if we know we have a valid instance.

Each time we change the protocol there is a decent bit of work involved, 3 packages to publish, and manual work for the flutter engine -> flutter framework roll.

That also makes it interesting to move more of the logic into the client, especially since we'll likely need to expose most of the information required for subtype checks to clients either way. I have some more static type ideas I need for the macros I have in mind (like resolving the StaticType for ExpressionCode arguments passed to macros or going from St aticType to TypeAnnotationCode instances), would it help if I open them as dependent CLs?

jakemac53 commented 4 months ago

Having client-side static types that are just descriptions sounds exciting and I'd love to give it a go if this is not being worked on yet. Another benefit is that we can go from TypeAnnotation to StaticType synchronously if we know we have a valid instance.

I am not sure we could do this and maintain synchronous APIs for the StaticType operations - we potentially the entire transitive type hierarchy to be able to answer the questions people want to answer.

That also makes it interesting to move more of the logic into the client, especially since we'll likely need to expose most of the information required for subtype checks to clients either way.

Yes, for sure.

I have some more static type ideas I need for the macros I have in mind (like resolving the StaticType for ExpressionCode arguments passed to macros or going from StaticType to TypeAnnotationCode instances), would it help if I open them as dependent CLs?

Dependent CLs which we can land together and do one release would be preferable, yes.

I can't promise how quickly I can get to reviewing these CLs. We are in the midst of a lot of somewhat more existential macro related questions that are higher priority than fleshing out these specific APIs. And potential changes as a result of that could also introduce more churn to StaticType itself also.

So, I am somewhat hesitant to land any changes at this moment because it might just introduce more migration work later, etc.

However, moving to a client side only model for this is likely more in line with the potential future direction here, and also generally should require less churn, so it is appealing regardless.

simolus3 commented 4 months ago

we potentially the entire transitive type hierarchy to be able to answer the questions people want to answer.

We get all superinterfaces from a class declaration it seems (nevermind, only identifiers), but I think we have to pre-load some identifiers either way (e.g. the subtype rules require checking against known types like Null in some cases, so we have to have that resolved type around if we want synchronous subtype checking).

If we have a client-side model, I think it would be helpful if users were able to construct their own static types from parts (e.g. if I have a bunch of resolved StaticType instances around I can craft a StaticRecordType out of them). That breaks if we have the resolved identifiers required for subtype checks associated with every type. We could go with an API similar to the analyzer and perhaps introduce StaticTypeSystem as a new interface that has isSubtype(StaticType a, StaticType b) as methods on it. Then StaticTypes can be purely descriptive and only know things about themselves and there could be an (async) method on TypePhaseIntrospector to obtain a StaticTypeSystem. That makes it clear where the loading happens and all the type algebra stuff can be synchronous as the relevant identifiers for Null and so on have already been resolved.

jakemac53 commented 4 months ago

~We get all superinterfaces from a class declaration it seems~ (nevermind, only identifiers)

These are only the immediate interfaces/mixins/superclass - the stuff syntactically visible on the class declaration.

I think that constructing a static type is always going to end up being an async operation in order to get the transitive type information.

If we have a client-side model, I think it would be helpful if users were able to construct their own static types from parts (e.g. if I have a bunch of resolved StaticType instances around I can craft a StaticRecordType out of them). That breaks if we have the resolved identifiers required for subtype checks associated with every type.

Why does it break?

We could go with an API similar to the analyzer and perhaps introduce StaticTypeSystem as a new interface that has isSubtype(StaticType a, StaticType b) as methods on it.

That does make sense if we need to load additional types.

simolus3 commented 3 months ago

I've started with a CL moving the subtype logic into the _macros package, so that the analyzer/CFE would only have to map their type model into macro types (similar to what we're doing for declarations): https://dart-review.googlesource.com/c/sdk/+/372760. It's not complete yet, but it would be helpful to get feedback on whether the general direction/API matches what we'd expect from a macro-side type implementation.

simolus3 commented 2 months ago

@jakemac53 Friendly ping :) Did you have a chance to look at the CL yet or is there anything else I can do to help with this?

jakemac53 commented 2 months ago

I was waiting to look until we have a better idea what the new API for introspection is going to look like, cc @davidmorgan might be the best to take a look actually.

I do think we would want to hold off on landing it until we know how its going to fit into the longer term picture.

simolus3 commented 2 months ago

@davidmorgan, could you take a look at this CL (or at least the API changes in api/builders.dart and api/introspection.dart)?

I'll need fairly complete model of Dart's type system in my macros, so I'd love to help with the effort of moving more parts of this into the client. So knowing whether that CL aligns with your ideas or if there's anything else I can help with here would be good.

davidmorgan commented 1 month ago

Thanks @simolus3! Sorry for the slow response, I've been on vacation.

We are working on a new package that macros will be based on that will live outside the SDK as a normal package

https://github.com/dart-lang/macros/tree/main/pkgs/dart_model

and I think the client-side types functionality belongs there.

Anything that needs to be sent host<->macro will end up in the JSON schema

https://github.com/dart-lang/macros/blob/main/schemas/dart_model.schema.json

from which we generate Dart code by running dart tool/dart_model_generator/bin/main.dart in the base of the repo, it generates

https://github.com/dart-lang/macros/blob/main/pkgs/dart_model/lib/src/dart_model.g.dart

and then we can add code to those with extensions in

https://github.com/dart-lang/macros/blob/main/pkgs/dart_model/lib/src/dart_model.dart

... this effort is still relatively new, but is picking up speed. It's currently fully uncoupled from the SDK so we can relatively easily experiment, including with your code if you want to try making it fit here. Alternatively you may prefer to wait until it's more mature, or for one of us to try to incorporate it--please let us know!

The JSON schema and classes in dart_model don't really deal with types yet, QualifiedName is a placeholder that can refer to a named type in a specified package, obviously quite a lot more complexity is needed :)

simolus3 commented 1 month ago

so we can relatively easily experiment, including with your code if you want to try making it fit here. Alternatively you may prefer to wait until it's more mature, or for one of us to try to incorporate it--please let us know!

Thanks for the repose, I'll try to add my type changes into those packages. I've immediately hit some roadblocks but I've opened issues for them on that repository :)