Closed filiph closed 7 years ago
There is no AggregateTransformer
option in build today, but there isn't any technical reason it couldn't exist.
The primary reason it doesn't exist is because its usage was heavily abused in transformers in ways that caused extreme slowness, and I hadn't seen a valid use case up until this point (in practice, plenty in theory).
People tend to use this any time they need to read more than one file (not realizing they could use the Transform
object to read additional files, or maybe they found it easier?), and would wrap up all codegen for an entire package in one single step instead of treating each entry point individually.
We could achieve the same functionality through two different features, a Builder
with no specified entry point file (we should do this anyways I think) and an api that is equivalent to Directory#list
. The internal AssetReader
interface already has this directory listing capability, and we could implement that in the barback world using an AggregateTransformer
. It would not be able to list your dependencies files, but that might be ok. This would have to be opt in, at least in the transformer world, as we don't want all wrapped builders to be aggregate transformers. WDYT?
cc @natebosch @kevmoo
What would be the api for configuring this? Would there be a BuildStep per directory? A single BuildStep for the entire build?
What would be the api for configuring this? Would there be a BuildStep per directory? A single BuildStep for the entire build?
Definitely up for debate, but one option would be a different type of Builder
, whose basic api could look something like this:
// better name tbd
abstract class NoInputBuilder {
// Possibly supply the root package name somehow?
Iterable<AssetId> get outputs;
// NoInputBuildStep is a BuildStep with no `inputId`. It should probably have
// a `package` getter though (the current package being ran on).
Future build(NoInputBuildStep buildStep);
}
This would mean setting up these build steps globally as part of the build, as they have fixed outputs.
So for the example that prompted this issue you would have something like the following:
class SubclassCollectorBuilder implements NoInputBuilder {
@override
final outputs = [new AssetId('my_package', 'test/all_subclasses.json')];
@override
Future build(NoInputBuildStep buildStep) {
// Lists all files under `buildStep.package`.
var files = await buildStep.list(globs: ['lib/**/*.dart']);
// Do whatever you want with those files, then use normal write apis.
// Open question: How to get a resolver?
}
}
If we're ok with a BuildStep per package I think we can make that work.
I wonder how much we can reuse existing interfaces rather than create parallel families. The NoInputBuildStep
could be a BuildStep
where the inputId
is an asset with a package
but no path
.
The Builder
class should probably be separate.
If we're ok with a BuildStep per package I think we can make that work.
Ya, this doesn't really shield users from the same pitfalls as AggregateTransformer
unfortunately. I do think there are valid use cases for Builder
s which don't have a single primary input though.
I wonder how much we can reuse existing interfaces rather than create parallel families. The NoInputBuildStep could be a BuildStep where the inputId is an asset with a package but no path.
I thought about this as well but I think its going to be more error prone. It would also really suck to have path
be nullable if we get non-nullable types in the future. Pretty much all code today assumes a non-null path for any AssetId
.
That being said I hate the NoInputBuildStep
name, I am just drawing a blank on a better one right now. Maybe PackageBuildStep
(and PackageBuilder
)?
/me chimes in
I think the confusing part for users right now is "how do I read and resolve multiple files to build a single output". It would be nice for our solution to also help here so we don't have to introduce yet-another type or breaking change in the near future.
@matanlurey Yes this same solution can solve that generally, we just have to figure out the api for the resolve
part but that is really only an api question.
Sidenote @filiph: Are all your subclasses you care about reachable from your main entry point dart file in lib
? If so you could could just have that be your primary input, and then you just resolve the world from there and crawl for all the subclasses you care about.
I think we could also add globs and/or directory listing capabilities on the existing BuildStep
if that would help.
The main question is whether the builder can be modeled as <input file> -> <output>
or if it needs to be <package> -> <output>
I think we could also add globs and/or directory listing capabilities on the existing BuildStep if that would help.
Yes, I wasn't super clear about that but I was proposing adding that to both BuildStep
impls. The main issue with that though is that we need to have two different BuilderTransformer
wrappers, one which supports that (via AggregateTransformer
) and one which doesn't.
@jakemac53: Not originally, but it's a valid approach, yes.
To give more context: I've had the following happen to me many times, and not only in Dart:
I've seen that when this is done automatically (or when the omission is reported as error early) there is a productivity boost.
The approach with a single entry-point is valid but only useful if there is file X that includes all members and I don't forget to add the new member there (to file X). In reality, I often forget even that.
The approach with a single entry-point is valid but only useful if there is file X that includes all members and I don't forget to add the new member there (to file X). In reality, I often forget even that.
@filiph that is a valid point. Do you have an opinion about an api that lets you list directories versus something closer to the existing aggregate transformer api?
There are 2 parts to this:
I think we're in agreement that we want a glob or directory listing api on BuildStep which handles the second point.
The question I still have is the first point. Would it be acceptable to have a file as an 'anchor' for the result?
Here's a straw man: I have a directory lib/implementations/
which has a variable number of files. I want to create a single file lib/all_implementations.dart
which has an entry for each file under lib/implementations/
or each class
within all of those libraries.
Is it OK for us to require the (empty) file lib/all_implementations.placeholder
and then write a Builder
which takes .placeholder
as input and creates .dart
as output, and in the middle reads all the files under lib/implementations/
. It won't actually read it's input, just use it to know that an output must exist there.
Is it OK for use to require the (empty) file lib/all_implementations.placeholder and then write a Builder which takes .placeholder as input and creates .dart as output, and in the middle reads all the files under lib/implementations/. It won't actually read it's input, just use it to know that an output must exist there.
I don't think we should require that, it does have the advantage of simplifying the api surface area but its also a pretty major annoyance for developers I would think.
After some offline discussion: It would be expensive to have a new variant of Builder and all the associated config and machinery. We know we want to support the directory listing stuff either way - we'll do that and document the patterns for using the placeholder file pattern. After that we can get some feedback on the pattern and reevaluate whether we want a way to handle it without the placeholder. Being able to do it without the placeholder might be the nicer UX, but we aren't sure yet if it's worth the cost.
FWIW, my initial expectation was that I'll get a build step that has "statically" specified glob of inputs (so, not globbing inside build() but before that, in Phases). But I may be biased from barback and blaze.
I'm of course okay with anything that works.
I think having better documentation and examples is likely going to be the story here looking at the work required - we should show an example of converting an aggregate transformer as part of the docs!
One thing I hadn't realized but came up in gitter while discussing sass_builder
- the placeholder file + directory listing APIs don't help use with picking up changed files.
We probably will need to have an AggregatBuilder api so that multiple files can trigger the rebuild
@natebosch How will this work with bazel? Isn't that already the case?
Yes I think with bazel we could make it work without AggregateBuilder
since bazel is deciding when to run and the other files would be srcs
I suppose we could also look at build_runner
and see if we can make it smarter about these things.
@jakemac53 any thoughts?
would it be possible to add inputs
field to the Builder
or to create an AggreateBuilder
that extends Builder
, something like:
abstract class AggregateBuilder extends Builder {
Set<AssetId> inputs;
}
then in https://github.com/dart-lang/build/blob/master/build_runner/lib/src/generate/build_impl.dart#L349 you could add:
if(action.builder is AggregateBuilder) action.builder.inputs = inputs;
then Inside my AggreateBuilder
I just would check if any of the inputs
could be treated as mainInput
and then process it. So in sass_builder
I would do something like:
class SassBuilder extends AggreateBuilder {
@override
Future build(BuildStep buildStep) async {
var inputId = buildStep.inputId;
if (basename(inputId.path).startsWith('_')) {
inputs.where((input) => !basename(input.path).startsWith('_')).forEach((input) {
var css = await render(input.path, packageResolver: await PackageResolver.current.asSync);
buildStep.writeAsString(_changeExtension(inputId), css);
});
return;
}
var css = await render(inputId.path, packageResolver: await PackageResolver.current.asSync);
buildStep.writeAsString(_changeExtension(inputId), css);
}
...
One thing I hadn't realized but came up in gitter while discussing sass_builder - the placeholder file + directory listing APIs don't help use with picking up changed files.
We probably will need to have an AggregatBuilder api so that multiple files can trigger the rebuild
This shouldn't be an issue, but we would need to add a notion of entire directory input nodes to the asset graph which shouldn't be too challenging? We could also integrate glob support to help you trim down the dependencies.
Essentially the additional logic would be the following:
When you list a directory (with a glob or not):
directory
asset node in the graph and add all outputs of the current action as outputs of the directory node (same logic as if you read additional files using the buildStep
, just a different type of node).When a file is created or deleted:
When a file is edited there is no extra work required.
We definitely want to do the glob api so I filed a separate issue to track - the remaining points around placeholder files etc are still unsettled.
Answered my SO question with a full example using a placeholder file:
Closing as the remaining issues with placeholders is a dupe of https://github.com/dart-lang/build/issues/421
Just to clarify: is there currently still no plan to introduce an AggregateBuilder
? Should we consider the placeholder file workaround a final solution (for the foreseeable future)?
I'm fine with that, just checking that I'm not missing some context.
@filiph Yup, we went with a more generic approach that will work with all build systems.
You can see a decent example of this expected behavior here: https://github.com/dart-lang/build/commit/65b08f6d5710ed78389bf1db74c5dc415e1b4819
We definitely could use better user-documentation on creating these types of builders, though.
@matanlurey So, if I’m understanding this correctly, the approach is to wrap the inputs in a regex, surrounded by $?
@matanlurey So, if I’m understanding this correctly, the approach is to wrap the inputs in a regex, surrounded by $?
Not quite - there are special synthetic assets in a few top level directories, which are only there for you to use as placeholders for these types of builders. These look like lib/$lib$
, web/$web$
, etc (that is the full path).
To glob your sources you would use the findAssets
api on BuildStep
, you simply ignore the primary input in this case (they are not readable).
Note that this is not actually the solution described in previous comments, its a simpler solution that we landed on after some offline conversations. In practice it actually seems to work out quite well.
It's not clear to me how to generate one output file that depends on many input files.
Let's say I want to generate a file that includes names of all subclasses of class X in a directory. *) These subclasses can live in various files in my
/lib
but I only want to build one JSON file.It's not clear how to do this.
BuildStep
has only oneinputId
and while there'shasInputs
, there is no way to get all of them (or a subset).Builder
only ever gets oneBuildStep
.In Transformer world, there's AggregateTransformer. If there's a similar thing in Build then it's not well documented.
*) Why would I want to do this? For easier testing, for example. When I implement a new subclass I want to make sure it's used in every place where it should be used as soon as I run unit tests (or sooner, via static errors). I use this in Python and it's really, really effective.