facebook / buck2

Build system, successor to Buck
https://buck2.build/
Apache License 2.0

How to use c/aquery to find the dependents of an anon_target #615

Open dmezh opened 1 month ago

dmezh commented 1 month ago

Given multiple named (non-anon) targets that depend on the same anon_target, is there a way to find all of those named targets? I think what I'm looking for is the rdeps of that anon_target within some target universe that covers those named targets. But whenever I try to use an anon target in a query, like this:

buck2 cquery "rdeps(//program:lint, anon//:lint_report_single@15ca92b70428afb8)"

buck2 says:

Error evaluating expression:
    rdeps(//program:lin<<omitted>>nt_report_single@15ca92b70428afb8)
    ^--------------------------------------------------------------^

Caused by:
    0: Invalid relative target pattern `anon//:lint_report_single@15ca92b70428afb8` is not allowed
    1: unknown cell alias: `anon`. In cell `root`, known aliases are: ...

JakobDegen commented 1 month ago

I've needed this too in the past and haven't found a good solution. Unfortunately, there's none that's super obvious to me; neither cquery nor aquery is really the appropriate tool here. I'll bring it up in the team's core sync meeting on Monday and see if anyone has any ideas.

JakobDegen commented 1 month ago

From the discussion yesterday: We weren't really sure what to do. There was some talk of adding analysis-query for this, but that seemed a bit overkill, since that's mostly identical to cquery and this is more or less the only use case.

The best idea so far is a buck2 audit analysis-graph :target, which would dump the entire analysis graph (id and deps of each node) of the target. That sounds pretty reasonable and wouldn't be too hard to implement I think
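If a dump like that existed, finding the dependents of an anon target would come down to a reverse traversal over it. The following is a minimal sketch, assuming the dump has already been parsed into a mapping from node id to direct dep ids. The dump format, the `rdeps` helper, and every label other than the anon target from the original query are hypothetical; `buck2 audit analysis-graph` is only a proposal at this point.

```python
from collections import defaultdict, deque


def rdeps(graph, target):
    """Return every node that transitively depends on `target`.

    `graph` maps a node id to the list of node ids it depends on directly.
    """
    # Invert the edges: for each dep, record who depends on it.
    reverse = defaultdict(set)
    for node, deps in graph.items():
        for dep in deps:
            reverse[dep].add(node)

    # Breadth-first walk up the inverted edges starting from the target.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in reverse[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical parsed dump; only the anon label comes from the query above.
ANON = "anon//:lint_report_single@15ca92b70428afb8"
graph = {
    "//program:lint": ["//program:fw_a", "//program:fw_b"],
    "//program:fw_a": [ANON],
    "//program:fw_b": [ANON],
    "//program:docs": [],
    ANON: [],
}

dependents = sorted(rdeps(graph, ANON))
```

Filtering `dependents` down to labels that do not start with `anon//` would leave just the named targets the original question asks about.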

dmezh commented 1 month ago

Thanks for discussing on your side.

I am in favor of a potential analysis-query. I think our use case is probably different from Meta's in that we often find ourselves making many (dozens) of distinct binaries from slightly different sets of objects but with significant sharing. One use case today is building ~15 firmware binaries with maybe 70% of files shared on average for a ~100 translation unit binary.

This means the naive build without anonymous targets has 1500 .o actions (15 binaries × ~100 translation units each). The anonymous targets build today is more like ~500. We also run several linters over those source files, again through anonymous targets, which delivers similarly extreme gains.

We are still working on scaling this but it's not difficult for me to imagine this same average ratio (70%) holding but for other bigger projects with ~10^2 configurations and O(10^2) source files.
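The numbers above can be reproduced with a back-of-the-envelope model. This sketch assumes, as a simplification not stated in the thread, that the shared 70% is the same set of files in every configuration, so each shared file compiles once and every other file compiles once per configuration. The function name and parameters are illustrative only.

```python
def compile_actions(configs, files_per_binary, shared_fraction):
    """Estimate compile action counts with and without sharing.

    Assumes the shared files are identical across all configurations.
    Returns (naive_count, deduplicated_count).
    """
    shared = round(files_per_binary * shared_fraction)
    unique_per_config = files_per_binary - shared

    # Naive: every configuration compiles every one of its files.
    naive = configs * files_per_binary
    # Deduplicated: shared files compile once, the rest once per configuration.
    deduplicated = shared + configs * unique_per_config
    return naive, deduplicated


today = compile_actions(configs=15, files_per_binary=100, shared_fraction=0.7)
larger = compile_actions(configs=100, files_per_binary=100, shared_fraction=0.7)
```

Under this model the current setup gives 1500 naive actions against 520 deduplicated ones, in line with the ~500 quoted above, and a project with 100 configurations gives 10000 against 3070.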

A full, capable query flavor would be far and away the best option in my mind here. It's not so much a rare debug event for this usage pattern as a common daily occurrence.

dmezh commented 1 week ago

@JakobDegen Wondering if you have any more thoughts on the anon_target query stuff

JakobDegen commented 1 day ago

Hey, sorry.

> I am in favor of a potential analysis-query. I think our use case is probably different from Meta's in that we often find ourselves making many (dozens) of distinct binaries from slightly different sets of objects but with significant sharing

I think, from the perspective of the design of how we would like buck2 to be designed and used, we would prefer that you didn't do this. Now, don't get me wrong, I understand why you're doing it and it probably is the right tradeoff among the options in front of you, but from a slightly more removed perspective, I would prefer putting effort into making that kind of pattern unnecessary, as opposed to putting effort into supporting it better.

That being said, I don't actually think that analysis query is a bad idea, and I think it fits into the model fine. I don't really expect anyone on our side to pick up work in this direction any time soon, but I think we'd accept a PR for either analysis query or something like audit analysis-graph.

dmezh commented 1 day ago

Hey,

> I think, from the perspective of the design of how we would like buck2 to be designed and used, we would prefer that you didn't do this. Now, don't get me wrong, I understand why you're doing it and it probably is the right tradeoff among the options in front of you, but from a slightly more removed perspective, I would prefer putting effort into making that kind of pattern unnecessary, as opposed to putting effort into supporting it better.

Which part do you mean you would prefer to be different? Splitting it into two:

First - the dozens (make it hundreds for some projects) of distinct binaries is a project requirement rather than a conscious choice on how to structure a build. The use case is building firmware binaries for many different configurations of the target and the artifacts themselves really do have to be hundreds of largely similar but ultimately different programs.

Please correct me if I'm saying something really stupid somewhere but:

Second - sharing objects between the configs is a conscious choice on how to structure the build to save build time. Ultimately though they really are the same action and with something like https://github.com/facebook/buck2/issues/611, they could be deduplicated on RE instead of being anonymous targets.
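The claim that these "really are the same action" can be illustrated with a content-addressed cache: if an action's identity is derived only from its command line and the digests of its inputs, two configurations that compile the same file with the same flags collapse to a single execution. This is a simplified sketch of that general idea, not buck2's or any remote execution service's actual digest scheme; `action_key`, `ActionCache`, and the sample commands and digests are all hypothetical.

```python
import hashlib
import json


def action_key(command, input_digests):
    """Derive a stable key from the command line and input content digests."""
    payload = json.dumps(
        {"command": command, "inputs": sorted(input_digests.items())},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ActionCache:
    """Runs each distinct action once and reuses the result afterwards."""

    def __init__(self):
        self._results = {}
        self.executions = 0

    def run(self, command, input_digests):
        key = action_key(command, input_digests)
        if key not in self._results:
            # Stand-in for actually executing the action remotely.
            self.executions += 1
            self._results[key] = "output-of-" + key[:12]
        return self._results[key]


cache = ActionCache()
compile_cmd = ["cc", "-O2", "-c", "util.c", "-o", "util.o"]
inputs = {"util.c": "digest-aaaa"}

# Two firmware configurations requesting the identical compile.
out_a = cache.run(compile_cmd, inputs)
out_b = cache.run(compile_cmd, inputs)

# A configuration with a different flag is a genuinely different action.
out_c = cache.run(["cc", "-O2", "-DBOARD_X", "-c", "util.c", "-o", "util.o"], inputs)
```

Here `out_a` and `out_b` are the same cached result and only two executions happen in total, which is the saving the anonymous targets currently provide at the build graph level.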