Open dcharkes opened 6 months ago
Let's see if I understand this correctly ...
The annotation marks an external
declaration which will be linked to some external "data" by the link step (build/link scripts?).
If the declaration is tree-shaken, then the link step can omit the corresponding data too, so it's tree-shaking external data by linking it to Dart declarations.
It works by the annotation having some metadata which is JSON-encoded and written to a some JSON file (before or after tree-shaking) where the build/link scripts can get the metadata and use it to build the requested external data. Probably includes the desired format too.
The naming of annotations is not as standardized as most other things. Yet?
If I see @ResourceReference(...)
I'd expect the annotated declaration to be a resource reference. Here it actually is the resource itself. (A property is its value.)
So @Resource(...)
or @Asset(...)
?
If I haven't understood things correctly, then ... something else!
Sorry for scattering information all around @lrhn!
We need some way of allowing users to tree-shake whole assets or parts of assets in link hooks.
For the simplest cases we can omit the whole resources.json
. For @Native('myAssetId') external
functions and, @Data('myOtherAssetId') Uint8List external
definitions (https://github.com/dart-lang/sdk/issues/54003#issuecomment-2128761001, https://github.com/dart-lang/sdk/issues/55809#issuecomment-2128779930) the Dart compiler can look at which assetId's are still reachable after tree shaking and then the bundler (dartdev or flutter tools), can omit all assets of which the assetIds are not reachable. (We could opt to have the @Native
and @Data
automatically be part of the resources.json as an implementation detail so that dartdev and flutter tools can use it.)
What we're trying to solve with "resource identifiers" (to be renamed) is more involved. The metadata
in the annotation, and the const argument values on the callsite to the static call, can both be used to compute inside the link hook (1) the assetId the information is relevant for, and (2) what part of that asset the information is relevant for. The link hook can then throw out parts of assets it decides are not used at runtime or whole assets that are not used at runtime.
For example for JNIgen when trying to implement proguard rules (https://github.com/dart-lang/sdk/issues/55568).
class JClassIdentifier implements /* to be renamed */ ResourceIdentifier {
final String originalJavaClassName;
const JClassIdentifier({
required this.originalJavaClassName;
});
}
@JClassIdentifier({
originalJavaClassName: "FooBar",
})
late final JClass foobar;
The link hook would then take such info and ensure the FooBar
Java class is still inside the .jar
if foobar
is used after tree-shaking, and throw out other Java classes. So in this case "FooBar" is a part of an asset, not a full asset. A "assetPartId" or something.
The naming of annotations is not as standardized as most other things. Yet?
Correct, that's why I'm filing issues. I'm offloading a bunch of whiteboard discussions to GitHub issues to do some extra rounds of bikeshed.. err naming.
If I see
@ResourceReference(...)
I'd expect the annotated declaration to be a resource reference. Here it actually is the resource itself. (A property is its value.)
Well, the annotated declaration uses an asset or a part of an asset at runtime. But the mapping from the const argument values and metadata in the annotation to asset id and asset part id happens in the link hook. The example above is a verbatim mapping from metadata to asset part id, but that is not always the case. For example an argument "dk"
might refer to asset ID: package:my_package/data/flags/dk.svg
or to asset id package:my_package/data/flags.zip
with asset part id `dk.svg.
So naming the annotation @UsesAsset(String? assetId, String? assetPartId)
doesn't work due to the fact that link hooks can transform the annotation data. (Crossed out that option in the initial post.)
Hence, I'm leaning more towards gearing the annotation name to the fact that the calls to this static function are recorded after treeshaking. E.g. @RecordCalls
. But since one can make a closure out of a static function, it is recording all references in the program to that static function, not just the static calls. Hence @RecordReferences
. (Sorry to completely throw you off @lrhn! The "reference" is a reference in the Dart program to the static function! Not a reference to the resource/asset! 🤣 🙈)
I think this shows that maybe we should have @TreeshakeInfo
as annotation?
@TreeshakeInfo(Object? metadata)
// With a doc comment that this will cause calls to be recorded after tree shaking and made available in link hooks.treeshake_info.json
as the suggested file name in docs (and what we use internally)package:treeshaking_info
which lives in the SDK in pkg/record_references and is imported by dart2js and pkg/vm and published unlisted (We can make listed if there ever is a use case where users want this for other purposes, then we can make dart compile js/exe
have a --treeshaking_info=...
.)gen_kernel --record_treeshaking_info=treeshaking_info.json ...
(or maybe just without "record").Ok, so the meaning of the annotation is that someone wants to know whether the annotated (static only? Function only?) declaration may be used in the program, and if so, how (the arguments passed at the user point if it's a function, and the use is a call, and only if the argument is a constant that can be JSON encoded) based on only the uses left after tree-shaking. (Is there an entry of there are no uses left, to distinguish it from no annotation at all?)
The compiler records the invocation of the annotated declaration, at least if it's a get or call. (Should probably support constructors, but if it only supports static functions, one can always forward. Does it record type arguments? Does it work with extension and extension type methods, which are not static declarations, but are resolved statically?)
What someone uses the information for is up to them, this is a general feature that can be used for anything a user can come up with and make a tool for.
The annotation can carry extra information, which is JSON encoded and stored with the use record. Again that can be completely general, and the consumer gets to decide what it means.
If that's correct, then @RecordUse(...)
sounds on-point. It's an imperative name, so an instruction that applies to the annotated declaration.
+1 for @RecordUse
, with the file being recorded_uses.dart
.
Thank you @lrhn, you make everything better!
Naming proposal:
@RecordUse(Object? metadata)
// With a doc comment that this will cause calls to be recorded after tree shaking and made available in link hooks.recorded_uses.json
as the suggested file name in docs (and what we use internally)package:recorded_uses
This has the parser and serializer for the format. It lives in the SDK in pkg/recorded_uses and is imported by dart2js and pkg/vm and published unlisted (We can make listed if there ever is a use case where users want this for other purposes, then we can make dart compile js/exe have a --treeshaking_info=....)gen_kernel --record_uses=recorded_uses.json ...
In the future we might want to add recording uses of other things (for example an annotation on a class, meaning that if any if the class survives tree shaking). But that still is valid to talk about recorded uses.
Is there an entry of there are no uses left, to distinguish it from no annotation at all?
No, we don't have info about things that are gone after tree shaking. Maybe we should. (Implementation: We should visit the tree before and after tree-shaking.) This would enable link hook writers to write their hooks as throwing things out instead of retaining things. I'm not sure how important this is though.
Should probably support constructors.
Does it record type arguments?
No, we could if they are const on the call site, identified by the library uri and class name. We can add this later if we find a use case.
Does it work with extension and extension type methods, which are not static declarations, but are resolved statically?
What someone uses the information for is up to them, this is a general feature that can be used for anything a user can come up with and make a tool for.
👍 We are on the same page!
Extensions yes:
Extension types, most probably not.
cc @rakudrama
After some discussion with @mosuem we feel like resource identifier is not a great name. Resource is kind of a synonym for asset.
Some options:
Naming linked to the language constructs:(We might want other treeshaking info than just calls to static methods)@RecordReference(Object? metadata)
recorded_references.jsonpackage:record_references which lives in the SDK in pkg/record_references and is imported by dart2js and pkg/vm and published unlistedgen_kernel --record_references=recorded_references.json ...
@RecordReference(Object? metadata)
gen_kernel --record_treeshaking_info=treeshaking_info.json ...
Naming linked to assets:(The link hook take the tree shaking info and know what assets it is related to.)@AssetReference(String? assetId, Object? metadata)
asset_references.jsonPackage:asset_references (in the SDK, and unlisted)The downside of the last option is that it is linked to the assets concept but that the compiler itself is just recording references so that's a bit weird. With it being possible to implement
RecordReference
(https://github.com/dart-lang/sdk/issues/55568), thenAssetReference
can be a subtype ofRecordReference
.With all cases
package:native_assets_cli
would not expose the types of the package but extension types around the package, because we'd only expose a subset of all the fields of the resource indentifiers.