Open TheGrizzlyDev opened 1 month ago
So I had kind of had this thought on a previous PR, but I think now might be the right time to bring it up:
It might be good to simplify the RE configuration and setup situation by making buck2 aware of different RE providers. Specifically what I have in mind here is a concept of "profiles," ie a buck2_re_client.provider_profile
that could be set to "engflow," "buildbarn," or left as some default variant. This profile would primarily affect the choice of defaults for many of the other configuration knobs in the buck2_re_client
section, but could also be used for things like in this PR, as well as maybe simplifying other things.
Concretely, the benefits I'm imagining for this are:
Is this something that you would be interested in either using or possibly also doing a bit of refactoring to implement it?
If so, I will need to raise this internally and make sure that the whole team is onboard with it, but I don't expect it to be too controversial.
As to this PR, my concern with it is basically that while I totally believe that this is useful and important for you and your users, I have basically no confidence that this won't contain anything from redundant to actively misleading information for other backends, so I don't know that I can just take it as-is. If you wanted to unblock yourself quickly we could just put this behind another configuration option, but that felt a bit silly hence the suggestion above.
Other than that, I think this ideally would include not only the little bit of extra text at the end, but the "action digest" blurb would be a part of this configuration too. So specifically, I'm imaging that somewhere we'd end up with a match statement that looks like
match profile {
Profile::Meta => format!("To reproduce: `frecli cas download-action {}`", digest),
Profile::Default => format!("Action digest: `{}`", digest),
Profile::EngFlow => error.message.clone(), // Or whatever else you wanted
}
As to this PR, my concern with it is basically that while I totally believe that this is useful and important for you and your users, I have basically no confidence that this won't contain anything from redundant to actively misleading information for other backends, so I don't know that I can just take it as-is. If you wanted to unblock yourself quickly we could just put this behind another configuration option, but that felt a bit silly hence the suggestion above.
I can help you figure this out since luckily I work on a RE backend (EngFlow) :) The whole point of the field is to display additional diagnostics from a RE backend on error or on demand (this part is currently missing as it requires some thinking in terms of UX) https://github.com/bazelbuild/remote-apis/blob/main/build/bazel/remote/execution/v2/remote_execution.proto#L1500 . This would work regardless of the RE implementation as per protocol it is how it is intended to be used (though, I admit, this isn't very clearly conveyed by the field's name).
As for the profiles, I fear that introduces logic that shouldn't live in the build system, but rather in the service. The protocol is already designed to support a wide level of configurability and I don't really see any specific use cases that would benefit from such a behaviour. Even this PR, it really is just a step towards following the more obscure aspects of the RE-APIs, those that a protobuf file hides away and that require reading the docs in depth (or having experience with a RE backend). Wdyt?
I can help you figure this out since luckily I work on a RE backend (EngFlow) :)
Yes, I know, that's why I said what I did :)
The whole point of the field is to display additional diagnostics from a RE backend on error or on demand (this part is currently missing as it requires some thinking in terms of UX) https://github.com/bazelbuild/remote-apis/blob/main/build/bazel/remote/execution/v2/remote_execution.proto#L1500 . This would work regardless of the RE implementation as per protocol it is how it is intended to be used (though, I admit, this isn't very clearly conveyed by the field's name).
Indeed, but also, I don't find the documentation on the field to be much more clear either. Out of curiosity, do you know what bazel does?
Even this PR, it really is just a step towards following the more obscure aspects of the RE-APIs, those that a protobuf file hides away and that require reading the docs in depth (or having experience with a RE backend)
I had looked for some more information on this field but didn't find any, although probably I don't know where to look. Do you have a link?
In any case, if you don't want to go the configurable route, I think I am going to have to ask you to actually test this PR against other RE backends and share some snippets of what the resulting effect on the error message is. Without that I basically just don't know what this change does.
I could test it on other RE backends but I don't really see the point. The goal of that field is to contain any additional debugging information a RE backend wishes to share with a client in case of a failure and as such it'd work exactly the same way with any other RE backend. This is also how Bazel uses it. If you think it could help I wouldn't mind adding a configuration flag to disable it though. Having said so, this really isn't anything EngFlow specific, we're just following the spec here :)
@TheGrizzlyDev the bar for including this kind of thing in the output isn't that the information just needs to be vaguely relevant. It needs to actually provide sufficient additional value on top of the existing error message to justify taking up additional screen space and making the error message more verbose. Specifically for action errors I think we're particularly sensitive to this, because we get it wrong already - our action errors are too verbose and hard to read and really we should be looking to reduce the information present in them.
Now if you're going to use this to show a link to some debugging UI, that seems like it might meet this bar and be a good thing to include in the output. But if other RE backends are only going to say something like "command failed with exit code 1" or repeat the action digest or something like that, that's not something we should just dump onto the user.
@TheGrizzlyDev the bar for including this kind of thing in the output isn't that the information just needs to be vaguely relevant. It needs to actually provide sufficient additional value on top of the existing error message to justify taking up additional screen space and making the error message more verbose. Specifically for action errors I think we're particularly sensitive to this, because we get it wrong already - our action errors are too verbose and hard to read and really we should be looking to reduce the information present in them.
Now if you're going to use this to show a link to some debugging UI, that seems like it might meet this bar and be a good thing to include in the output. But if other RE backends are only going to say something like "command failed with exit code 1" or repeat the action digest or something like that, that's not something we should just dump onto the user.
Well, when the field is present then by definition it adds relevant information for debugging. Though there aren't any restrictions for the contents, you shouldn't expect anything like the message you mentioned either. In our case we use to share a link that is very useful for debugging. As far as I can tell other backends either do not implement it or use it to propagate similarly useful information (buildbarn does the same thing https://github.com/buildbarn/bb-remote-execution/blob/b669c2edbdf3661ea1bf39126a2d2d01f81ab161/pkg/builder/caching_build_executor.go#L62 )
Ping :)
Ping @JakobDegen