Open sluongng opened 3 weeks ago
Just to clarify: are we talking about build_event_stream.proto a.k.a. Build Event Protocol, or publish_build_event.proto a.k.a. Build Event Service?
Standardizing the latter (BES) might make sense. As for the former (BEP), I'm not convinced that's a good idea, because it exposes information in a schema that corresponds to Bazel's data model. For example, is it realistic to assume that Pants, Buck2, etc. all have the equivalent of a "ConvenienceSymlinksIdentified" event? I don't think so.
Agree. I don't think we want to make Bazel-specific events a standardized spec.
I think a good starting point would be a new event protocol that meets all the common needs of existing tools:
And leave an Any field for different tools to implement domain-specific events. Over time, we can identify common needs between tools (i.e. more than 2 tools interested in the same thing) and add more event types to the spec.
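To make the idea concrete, here is a minimal sketch of a tool-neutral event envelope with an opaque extension slot, loosely modeled on protobuf's Any (a type URL plus raw bytes). Everything in it is an assumption for illustration: the names BuildEvent, Extension, and decode_extension, the common fields, and the type URL are hypothetical and not part of any existing spec.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Extension:
    """Opaque tool-specific payload: a type URL plus raw bytes (hypothetical)."""
    type_url: str
    value: bytes


@dataclass(frozen=True)
class BuildEvent:
    """Hypothetical common envelope shared by all tools."""
    build_id: str
    sequence_number: int
    kind: str  # e.g. "build_started", "tool_specific" (assumed names)
    extension: Optional[Extension] = None


def decode_extension(
    event: BuildEvent,
    decoders: Dict[str, Callable[[bytes], dict]],
) -> Optional[dict]:
    """Decode the extension if this consumer knows its type, else return None."""
    if event.extension is None:
        return None
    decoder = decoders.get(event.extension.type_url)
    return decoder(event.extension.value) if decoder else None


# Hypothetical type URL for a Bazel-only event carried in the extension slot.
SYMLINKS_TYPE = "type.example.com/bazel.ConvenienceSymlinksIdentified"

event = BuildEvent(
    build_id="build-1",
    sequence_number=3,
    kind="tool_specific",
    extension=Extension(SYMLINKS_TYPE, json.dumps({"symlinks": ["bazel-bin"]}).encode()),
)

# A Bazel-aware consumer registers a decoder; a generic consumer registers none.
bazel_decoders = {SYMLINKS_TYPE: lambda raw: json.loads(raw)}
print(decode_extension(event, bazel_decoders))
print(decode_extension(event, {}))
```

The point of the sketch is that a generic consumer can still read the common fields and skip an extension it does not understand, while a tool-aware consumer opts in by registering a decoder for that type URL.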
cc: @philwo @aherrmann @bergsieker who might be interested in this topic.
My concern is that if we attempt to standardize anything that is in excess of the Build Event Service, it would severely suffer from an inner-platform effect.
You might notice that even within Google we have (at least) two different interfaces for this. When we looked at standardizing them years ago, we found that BEP/BES didn't map well onto the Chromium build lifecycle. I don't recall the details, but certainly at least part of it was due to hierarchical builds, where one build initiates another, and you want to be able both to track them separately and to provide a rollup view. Both Bazel and Chrome had too much entrenched usage to make changing them realistic.
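As a rough illustration of the hierarchical case, the sketch below models builds that spawn child builds, keeps each build's own numbers, and computes a rollup across the tree. It is only an assumption about what such a model could look like: the Build class, the failed_actions field, and the rollup function are hypothetical and are not taken from BEP, BES, or the Chromium tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Build:
    """Hypothetical build record; children are builds this build initiated."""
    build_id: str
    failed_actions: int = 0
    children: List["Build"] = field(default_factory=list)


def rollup_failed_actions(build: Build) -> int:
    """Sum failures over this build and every build it transitively initiated."""
    return build.failed_actions + sum(
        rollup_failed_actions(child) for child in build.children
    )


leaf = Build("grandchild", failed_actions=0)
child = Build("child", failed_actions=2, children=[leaf])
parent = Build("parent", failed_actions=1, children=[child])

# Each build is still tracked separately...
print(child.failed_actions)
# ...while the parent also exposes a rollup view over the whole tree.
print(rollup_failed_actions(parent))
```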
My gut feeling here is that BEP doesn't generalize well to other tools. BES might generalize but I'm not sure. However, Bazel is unlikely to move to a new protocol due to the significant infrastructure that we've built internally around BEP.
I'd suggest exploring what this looks like when built on top of an existing framework like OpenTelemetry. It's possible there could be enough momentum from non-Bazel tools to get that off the ground, and leveraging existing open standards is good when possible.
Added ReClient's events to the issue's description.
I've just opened up a proposal to add BES (or something equivalent) to Buck2: https://github.com/facebook/buck2/pull/806
Today, several build event protocols exist across mature build tools, each enabling build telemetry use cases:
On top of these, many build tools and CI systems in the wild have started adopting more generic telemetry systems (OpenTelemetry, Prometheus) for their CI/CD telemetry needs:
So I want to start a discussion about a standardized Build Event Protocol so that different client and server implementations can agree on a common specification moving forward, and reduce overall fragmentation.
Please comment below if you are interested in adopting such a spec.