dart-lang / native

Dart packages related to FFI and native assets bundling.
BSD 3-Clause "New" or "Revised" License

[native_assets_cli] Dart API interface per asset type #994

Open dcharkes opened 2 months ago

dcharkes commented 2 months ago

Make package:native_assets_cli only consume an API that shows getters for native code (and not any getters for Java or other asset types). This can be achieved by

  1. nesting NativeBuildConfig inside BuildConfig, which doesn't work well with shared fields such as outputDirectory, or
  2. BuildConfig implements NativeBuildConfig where only a subset of the getters is visible, or
  3. an extension type NativeBuildConfig on BuildConfig.
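
As a rough illustration of option 3, the sketch below wraps a stand-in BuildConfig in an extension type (Dart 3.3 or later) that forwards only the shared and native getters. The class shape and the targetArchitecture and javaTargetVersion fields are assumptions made for the example, not the actual package:native_assets_cli API.

```dart
// Hypothetical stand-in for the full config. Only outputDirectory is named in
// this issue; the other two fields are invented for illustration.
class BuildConfig {
  final Uri outputDirectory;
  final String targetArchitecture; // assumed native-only field
  final String javaTargetVersion; // assumed Java-only field

  const BuildConfig({
    required this.outputDirectory,
    required this.targetArchitecture,
    required this.javaTargetVersion,
  });
}

// Option 3: a compile-time view over BuildConfig that exposes only the
// shared and native getters. No wrapper object exists at runtime.
extension type NativeBuildConfig(BuildConfig _config) {
  Uri get outputDirectory => _config.outputDirectory;
  String get targetArchitecture => _config.targetArchitecture;
  // javaTargetVersion is deliberately not forwarded, so code that only sees
  // NativeBuildConfig cannot read it.
}
```

A native builder would then accept a NativeBuildConfig, constructed as NativeBuildConfig(config), and the shared outputDirectory stays reachable without any nesting.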

Make package:native_toolchain_c add assets to a NativeBuildOutput that doesn't have methods/setters related to Java assets or data assets. This can be achieved by

  1. BuildOutput implements NativeBuildOutput and NativeBuildOutput.addAsset takes NativeCodeAsset instead of Asset.
  2. an extension type.
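
Option 1 for the output side could look roughly like the sketch below. It relies on Dart allowing an override to widen a parameter type, so BuildOutput.addAsset can accept any Asset while the NativeBuildOutput interface only accepts NativeCodeAsset. All class shapes and the runNativeBuilder helper are hypothetical stand-ins.

```dart
// Hypothetical asset hierarchy; the real classes carry more fields.
abstract class Asset {
  String get id;
}

class NativeCodeAsset implements Asset {
  @override
  final String id;
  final Uri file;
  NativeCodeAsset(this.id, this.file);
}

class DataAsset implements Asset {
  @override
  final String id;
  DataAsset(this.id);
}

// The narrow interface handed to native builders.
abstract interface class NativeBuildOutput {
  void addAsset(NativeCodeAsset asset);
}

// The full output widens the parameter type, which is a valid override.
class BuildOutput implements NativeBuildOutput {
  final List<Asset> assets = [];

  @override
  void addAsset(Asset asset) => assets.add(asset);
}

// Stand-in for a builder such as the one in package:native_toolchain_c.
void runNativeBuilder(NativeBuildOutput output) {
  output.addAsset(
    NativeCodeAsset('package:my_package/foo', Uri.parse('libfoo.so')),
  );
  // output.addAsset(DataAsset('x')) would be a compile-time error here.
}
```

A builder that wants to emit more than one asset type would take the full BuildOutput instead, which is the case raised in the question that follows.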

We could even have assetId be optional for some asset types (jars) in the API.

Question: Don't we ever have builders that would like to add more than one asset type? They would need to take the full BuildOutput.

Related:

Sister issue for the JSON protocol:

dcharkes commented 2 months ago

Nesting the assets inside an asset type in the API has consequences for how a link.dart is structured.

Having it nested means that older link.darts are not aware of new asset types (and will ignore them silently). Ignoring silently would be weird because we would specify that an asset is destined for a certain link script. Having one list of assets (with an asset type per asset) requires explicit switching in a link.dart script, which requires developers to deal explicitly with possible new asset types.

So, @mosuem and I believe it's better to have a single list of assets.
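
One way to picture the explicit switching is a sealed asset hierarchy with an exhaustive switch in the link script. This is only a sketch: the class names and the describeForLinking function are assumptions, and the real API may model asset types differently.

```dart
// Hypothetical sealed hierarchy: every asset in the single list has a type.
sealed class Asset {
  final String id;
  const Asset(this.id);
}

final class NativeCodeAsset extends Asset {
  const NativeCodeAsset(super.id);
}

final class DataAsset extends Asset {
  const DataAsset(super.id);
}

// Stand-in for the body of a link.dart. The switch has no default case, so
// adding a new subclass of Asset makes this function fail to compile until
// the new type is handled explicitly.
String describeForLinking(Asset asset) => switch (asset) {
      NativeCodeAsset() => 'link native code ${asset.id}',
      DataAsset() => 'minify data ${asset.id}',
    };
```

With nested per-type lists, by contrast, an older link script would simply never look at a list it does not know about.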

mkustermann commented 2 months ago

Having it nested means that older link.darts are not aware of new asset types (and will ignore them silently).

I don't understand this at all. One only uses link.dart for specific asset types.

I'd view our system as a layered architecture:

It probably makes sense for there to be one linker per asset-API/asset-type (imagine a C linker: it combines all the native code into one .so file; imagine localization messages: the linker combines the localizations from all packages into one big one). The pubspec version constraints on the linker can ensure that the version of the linker supports the version of the asset-API/asset-type.

One way to look at it is a map-reduce system: All emitted assets by build.dart (the map phase) are grouped by asset-API/asset-type and get their corresponding link.dart (the reduce phase) invoked. The link.dart (reducer) may only produce 1 asset but may also produce multiple.
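
The map-reduce view can be sketched as a grouping step followed by one linker call per group. The Asset shape, the Linker typedef, and linkAll are all hypothetical; passing unlinked groups through unchanged is an assumption of this sketch, not something the thread settles.

```dart
// Hypothetical asset with a string type key.
class Asset {
  final String type;
  final String id;
  const Asset(this.type, this.id);
}

// A linker (the reduce step) takes all assets of one type and may return one
// asset or several.
typedef Linker = List<Asset> Function(List<Asset> inputs);

List<Asset> linkAll(List<Asset> emitted, Map<String, Linker> linkers) {
  // Group the assets emitted by all build.dart runs (the map phase) by type.
  final byType = <String, List<Asset>>{};
  for (final asset in emitted) {
    byType.putIfAbsent(asset.type, () => []).add(asset);
  }

  // Invoke the linker registered for each type, if there is one.
  final bundled = <Asset>[];
  byType.forEach((type, group) {
    final linker = linkers[type];
    bundled.addAll(linker == null ? group : linker(group));
  });
  return bundled;
}
```

For example, a linker registered under 'native_code' could fold several object files into a single library asset while data assets pass through untouched.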

dcharkes commented 2 months ago

It probably makes sense for there to be one linker per asset-API/asset-type

Conceptually yes, but the question is how to make this work nicely.

Suppose there are two packages that each have a link.dart that wraps a C linker, or that knows how to deal with some reusable localization format. If an app transitively depends on two packages that handle the same asset type, we get into some questions. E.g. do we just fail the build? How do we even know what asset types are supported by a linker? The link.dart and build.dart protocol is single invocation. So you'd have to send all asset types to all link.darts.

To avoid these issues, @mosuem and I thought it would make sense to have asset-types conceptually namespaced by package name. So instead of the asset-type determining to which link.dart a to-be-linked-asset is sent, we'd declare it in the protocol with the package name:

# build_output.yaml/json
assets:
  - # immediately bundled
assets_for_linking:
  native_toolchain_c:
    - # an asset being sent to native_toolchain_c tool/link.dart for linking

The downside of namespacing asset types with package names is that we can't really do drop-in replacements of linkers. E.g. if someone comes up with a better JSON minifier, every build.dart outputting JSON files would need to be updated to send its assets to be linked to the new and shinier link.dart of the new package.

So from a map-reduce point of view:

  1. does the build.dart output declare to which link.dart an asset is sent, or
  2. does the build.dart just output some key, all assets are sent to all link.darts, and link.darts should ignore assets that are not their own asset type, and things go horribly wrong when two link.darts consume the same asset type.

Map-reduce works with the first approach, if I understand correctly. My comment was written assuming this approach.

If we have both a concept of targetLinker: <package_name> and Asset.type, then it could be that someone sends an asset of some asset type to a linker, and that linker doesn't know about that asset type at all. That was what my comment was about. Does that make sense?

mkustermann commented 2 months ago

all assets are sent to all link.darts, and link.darts should ignore assets that are not their own asset type, and things go horribly wrong when two link.darts consume the same asset type.

Definitely not.

There's multiple options:

One could do a combination:

dcharkes commented 2 months ago

I like the combination option.

I'd need to spend a bit more time thinking about some of the specifics.

But in general I think this is a good approach.

For our first use cases, I think the build.dart-specified-linker suffices. And then we can later extend it.

(Side note: These considerations are more for https://github.com/dart-lang/native/issues/153. Not really what this issue was about.)

mkustermann commented 2 months ago

If we have temporary asset types (e.g. a .o file or something), such an asset must be consumed by a linker; it cannot be left unlinked.

On the lowest level each bundling & runtime system (flutter and dart) will have a fixed set of asset-APIs it supports. So if

An interesting thought experiment would be to see how one could make custom asset-APIs that neither Dart nor Flutter know about, which then get lowered to the ones that the bundling tool supports:

If we can make this work we have a general mechanism that

dcharkes commented 2 months ago
  • we're in JIT mode and not linking (?) all assets emitted by build.dart need to be of one of the fixed types => So the bundling tool will issue an error if there's any emitted assets that we don't support

I was thinking we would execute link.dart scripts in JIT mode, but it would not have the AOT-treeshaking information.

  • we're in AOT mode and perform linking we may allow build.dart to emit an extended set of assets (or arbitrary assets -e.g. with mime type?) but expect the emitted assets of link.dart to be of the fixed set that's supported by the bundling tool => So the bundling tool will issue an error in link phase if there's any emitted assets that we don't support.

Yes that's the idea.

Now that we have a supportedAssetTypes in the BuildConfig (and LinkConfig), we can even support a different set of asset types depending on whether we're in JIT or AOT. We'd just emit a different list in the BuildConfig.
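
A minimal sketch of how a bundling tool might use such a list, assuming supportedAssetTypes is a list of string keys. The type names ('native_code', 'data', 'svg'), the configFor helper, and unsupportedAssets are all hypothetical.

```dart
// Hypothetical config carrying the asset types the current mode supports.
class BuildConfig {
  final List<String> supportedAssetTypes;
  const BuildConfig(this.supportedAssetTypes);
}

class Asset {
  final String type;
  final String id;
  const Asset(this.type, this.id);
}

// Assumed policy: AOT builds run link.dart, so build.dart may emit an
// extended set of types there; JIT builds only accept the fixed set.
BuildConfig configFor({required bool aot}) => BuildConfig([
      'native_code',
      'data',
      if (aot) 'svg', // hypothetical to-be-linked type
    ]);

// Returns the emitted assets the bundling tool would have to reject.
List<Asset> unsupportedAssets(BuildConfig config, Iterable<Asset> emitted) =>
    emitted
        .where((asset) => !config.supportedAssetTypes.contains(asset.type))
        .toList();
```

The bundling tool would report an error whenever unsupportedAssets returns a non-empty list for the output of a build.dart.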

An interesting thought experiment would be to see how one could make custom asset-APIs that neither Dart nor Flutter know about, which then get lowered to the ones that the bundling tool supports: [...]

I think it would be simpler if we always ran the linking step, so that this package would always emit the same format. Then its runtime doesn't have to branch on JIT/AOT.

(Side note: This sounds exactly like the use case mentioned in https://github.com/flutter/flutter/issues/143348.)

If we can make this work we have a general mechanism that

  • allows packages to define asset APIs
  • allows the building/linking to use user-defined asset kinds & transformations that lower to the APIs we have in dart/flutter
  • allows the runtime system of the package to use the lower-level APIs we have in dart/flutter to load assets for the higher-level concept of their package

Yep, that's the idea! 👌

mkustermann commented 2 months ago

I was thinking we would execute link.dart scripts in JIT mode, but it would not have the AOT-treeshaking information.

For some things no linking will be needed (e.g. a readily available .so file, or just including a file that can be accessed at runtime). So at least for those asset kinds for which no linker was specified (at the per-package, per-app, or build tool level), no linking is needed. Then there's the question of whether there are valid use cases where a linking step is required when a) we don't have tree-shaking information and b) we want to run the app as fast as possible (development cycle) and not "optimize" any assets. Do we have valid use cases for this?

(Side note: This sounds exactly like the use case mentioned in https://github.com/flutter/flutter/issues/143348.)

Yes. Stay tuned about this - working on that part!

dcharkes commented 2 months ago

Then there's the question of whether there are valid use cases where a linking step is required when a) we don't have tree-shaking information and b) we want to run the app as fast as possible (development cycle) and not "optimize" any assets. Do we have valid use cases for this?

I'm thinking that it's a required step for the svg compiler mentioned in that issue.

cc @mosuem all the above thoughts.

mkustermann commented 2 months ago

I'm thinking that it's a required step for the svg compiler mentioned in that issue.

SVGs can be parsed & displayed at runtime, or can be pre-processed to something else (e.g. a bunch of triangles with shading information, which may take a long time), and that something else can be loaded & displayed.

Also the build.dart can do the svg processing as well, you don't need a linker step to do it.

We may want to communicate to build.dart whether we're in development mode or not (which we indirectly also do e.g. if we tell it to produce .so files or static library .a files).

If there's a real need we can of course support running the linking in development mode as well; I just fear that it may be misused to do a lot of work where it will harm the development cycle.

dcharkes commented 2 months ago

Also the build.dart can do the svg processing as well, you don't need a linker step to do it.

That requires the build.dart of the user app to invoke some compilation from package:vector_image's Dart API, instead of package:vector_image having a link.dart that processes all of them. And that would only work for SVGs from the root package. If you have a helper package, that helper package would need to decide whether it compiles the SVGs itself (preventing any tree-shaking) or whether it outputs them to be linked. How did you envision having build.dart do it in such a context?

We may want to communicate to build.dart whether we're in development mode or not (which we indirectly also do e.g. if we tell it to produce .so files or static library .a files).

BuildMode.debug?

(We currently don't have a concept of develop vs release in Dart standalone. Should all JIT be considered development mode?)

If there's a real need we can of course support running the linking in development mode as well; I just fear that it may be misused to do a lot of work where it will harm the development cycle.

Hm, that's indeed something to consider.

mkustermann commented 2 months ago

That requires the build.dart of the user app to invoke some compilation from package:vector_image's Dart API, instead of package:vector_image having a link.dart that processes all of them. And that would only work for SVGs from the root package. If you have a helper package, that helper package would need to decide whether it compiles the SVGs itself (preventing any tree-shaking) or whether it outputs them to be linked. How did you envision having build.dart do it in such a context?

Somewhat as described above: If I have svgs in my package, then I need to tell the system my package needs those svgs:

// hooks/build.dart
import 'package:svg_cli_build/svg_cli_build.dart';

main(args) async {
  await runBuild((config, output) {
    SvgBuilder('package:mypackage', ['icons/a.svg', 'icons/b.svg']).build(config, output);
  });
}

In my package (doesn't have to be root package) I then do

// package:foowidget/foowidget.dart
import 'package:svg/svg.dart';

class FooWidget {
  ... = loadSvgApi('package:mypackage', 'icons/a.svg');
}

Now package:svg_cli_build's SvgBuilder may

Now package:svg's loadSvgApi

i.e. we have higher-level concepts

that under the hood rely on lower level things supported by dart/flutter build/bundle/runtime.

In some sense this is very natural: The package that knows how to e.g. compile C code probably also knows how to link it. The package that provides an intl/l10n API probably knows how to tree shake the intl/l10n files. So there can be one package for the compile-time component (build/link) and one for runtime, and they can possibly even be the same.

dcharkes commented 1 week ago

We could even have assetId be optional for some asset types (jars) in the API.

Currently, Asset has a non-nullable assetId. That makes sense for data assets and native code assets, as they are both accessed from Dart code through an asset id. It's unlikely that we would ever access Jar assets via an asset id. So we might want to move assetId into code asset and data asset. (Or we make id optional, like file already is.)
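
Moving the id down into the subclasses could look roughly like this. The class names and fields are assumptions for illustration (JarAsset in particular is hypothetical); only the nullable file on the base type reflects what is described above.

```dart
// Hypothetical base type: no id here, and file stays optional.
abstract class Asset {
  Uri? get file;
}

// Native code and data assets are looked up from Dart code by id, so the id
// lives on these classes and stays non-nullable.
class NativeCodeAsset implements Asset {
  final String id;
  @override
  final Uri? file;
  NativeCodeAsset({required this.id, this.file});
}

class DataAsset implements Asset {
  final String id;
  @override
  final Uri? file;
  DataAsset({required this.id, this.file});
}

// A jar is bundled but never looked up by id from Dart, so it has none.
class JarAsset implements Asset {
  @override
  final Uri? file;
  JarAsset({this.file});
}
```

The alternative mentioned in parentheses would instead keep a single nullable id on Asset, which is less code but pushes null checks onto every consumer of native code and data assets.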