diegohavenstein opened 11 months ago
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
View this failed invocation of the CLA check for more information.
For the most up-to-date status, view the checks section at the bottom of the pull request.
Though I approve of this change, I do think we're still leaving some things undefined that are worth specifying.
Notably: what is the maximum individual/combined size of Tree messages that UpdateActionResult()/GetActionResult() is willing to tolerate when performing completeness checking? Relatedly, what is the maximum size of a Directory message that may be contained in a Tree?
Buildbarn has the ability to limit these through its maximumTotalTreeSizeBytes and maximumMessageSizeBytes options.
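For reference, those options appear in Buildbarn's storage configuration roughly as follows. This is a hedged sketch only: the exact nesting of these options varies between bb-storage versions, so treat the structure (and the values) as illustrative, not as a copy-paste configuration.

```jsonnet
// Illustrative bb_storage configuration fragment. Option names come from
// the discussion above; their exact placement in the config tree is an
// assumption and may differ between bb-storage versions.
{
  // Caps the size of any individual message the service will process.
  maximumMessageSizeBytes: 16 * 1024 * 1024,
  actionCache: {
    completenessChecking: {
      backend: {},  // actual action cache backend configuration elided
      // Caps the combined size of Tree objects walked during
      // completeness checking of an ActionResult.
      maximumTotalTreeSizeBytes: 64 * 1024 * 1024,
    },
  },
}
```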
(Sorry for my lack of deep knowledge in this area.) Does it make sense to have fine-grained settings for these? I'm not sure how exactly they relate to the max blob size in the CAS.
Are you referring to UpdateActionResult()/GetActionResult() in ActionCacheServer.java?
Is that ActionCacheServer.java in bazel-buildfarm? If so... yes?
GetActionResult() is only permitted to return ActionResults if all of the output files they reference exist. ActionResults may also refer to output directories (Tree objects), which in turn point to files. Even if max_cas_blob_size_bytes is set to 100 TB, that doesn't mean it's okay to upload a 100 TB Tree object and provide it to a call to UpdateActionResult(). There would be no way for a remote cache to validate it in a reasonable amount of time/space.
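To make the concern concrete, here is a minimal Python sketch of the kind of completeness check being described. The types are simplified stand-ins for the REAPI Tree/Directory messages, and the size cap is a hypothetical server-side value, not actual server code:

```python
# Sketch: why a cache must bound Tree sizes before completeness-checking
# an ActionResult. Types are simplified stand-ins for the REAPI protos.

MAX_TREE_SIZE_BYTES = 4 * 1024 * 1024  # hypothetical server-side cap


class FileNode:
    def __init__(self, digest):
        self.digest = digest  # digest of the referenced CAS blob


class Directory:
    def __init__(self, files=(), subdirs=()):
        self.files = list(files)      # FileNode entries
        self.subdirs = list(subdirs)  # nested Directory entries


class Tree:
    def __init__(self, root, encoded_size_bytes):
        self.root = root
        # Size of the serialized Tree blob as stored in the CAS.
        self.encoded_size_bytes = encoded_size_bytes


def check_tree_complete(tree, cas_blobs):
    """Return True iff every file the tree references exists in the CAS.

    Rejects oversized trees up front: without a cap, a client could
    upload an arbitrarily large Tree and force the server to walk it.
    """
    if tree.encoded_size_bytes > MAX_TREE_SIZE_BYTES:
        raise ValueError("Tree exceeds maximum size for completeness checking")
    stack = [tree.root]
    while stack:
        d = stack.pop()
        for f in d.files:
            if f.digest not in cas_blobs:
                return False
        stack.extend(d.subdirs)
    return True
```

A server would run a check like this inside UpdateActionResult()/GetActionResult() before trusting an ActionResult; the up-front size cap is what keeps the walk bounded.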
I agree that there are more types of limits that could be specified here: total blobs in a tree, total tree depth, max files/subdirectories per level, etc. However, I think this limit stands alone, provides reasonable value, and is easy to understand. I think committing this on its own is good, and we can have a separate conversation about whether specifying additional limits would be valuable.
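For illustration only, the additional structural limits mentioned (tree depth, entries per directory) could be enforced by the same kind of tree walk. A hedged Python sketch with hypothetical limit names and values, using plain dicts as simplified stand-ins for the REAPI Directory message:

```python
# Hypothetical structural limits a server might impose on a Tree, in
# addition to a raw byte-size cap. Names and values are illustrative,
# not part of the Remote Execution API.

MAX_TREE_DEPTH = 64
MAX_ENTRIES_PER_DIRECTORY = 10_000


def validate_tree_shape(directory, depth=0):
    """Recursively enforce depth and fan-out limits on a directory tree.

    `directory` is {"files": [...], "subdirs": [...]} -- a simplified
    stand-in for the REAPI Directory message.
    """
    if depth > MAX_TREE_DEPTH:
        raise ValueError("tree exceeds maximum depth")
    if len(directory["files"]) + len(directory["subdirs"]) > MAX_ENTRIES_PER_DIRECTORY:
        raise ValueError("too many entries in one directory")
    for sub in directory["subdirs"]:
        validate_tree_shape(sub, depth + 1)
```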
Is it worth introducing a new ServerLimits/ServerConstraints message that contains various different limits to be introduced in the future?
That sounds like a good idea to keep CacheCapabilities stable, yes. I'll push a patch
I had another look, and I don't think it's a good fit in this context: many things already in CacheCapabilities (e.g. max_batch_total_size_bytes or supported_compressors) could arguably belong in the newly proposed message, since they also describe limits/constraints. As such I think max_cas_blob_size_bytes is in the right place as is.
Do you agree @sluongng ?
Sounds reasonable to me. Thanks for investigating! 🤗
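For context, the agreed outcome keeps the new field directly in CacheCapabilities rather than in a separate ServerLimits message. Schematically (an excerpt sketch only: the field number here is a placeholder, and the neighboring fields are elided; see remote_execution.proto for the authoritative definition):

```proto
// Schematic excerpt of CacheCapabilities; not the authoritative proto.
message CacheCapabilities {
  // ... existing fields such as max_batch_total_size_bytes and
  // supported_compressors elided ...

  // Maximum size in bytes of a single blob the CAS will accept.
  // A value of 0 means no limit is advertised.
  int64 max_cas_blob_size_bytes = 99;  // placeholder field number
}
```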
I'm good to ship this as-is. What is the process? It seems I can't merge commits here, @bergsieker
We need this field to improve the performance of Bazel when using a remote cache: with it, the client can determine the upper size limit for CAS entries and skip uploads that would fail due to size constraints.
The issue on Bazel side is https://github.com/bazelbuild/bazel/issues/18449
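On the client side, the behavior described above amounts to a simple guard before each upload. A hedged Python sketch (the function name is illustrative, not Bazel's actual code):

```python
def should_upload(blob_size_bytes, max_cas_blob_size_bytes):
    """Decide whether a blob can go to the CAS at all.

    `max_cas_blob_size_bytes` is the server's advertised capability;
    0 (or None) means the server declared no limit. A blob over the
    limit is skipped up front instead of failing mid-upload.
    """
    if not max_cas_blob_size_bytes:
        return True
    return blob_size_bytes <= max_cas_blob_size_bytes
```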