facebook / buck2

Build system, successor to Buck
https://buck2.build/
Apache License 2.0

RE: Error, message length too large for `BatchReadBlobs` #583

Open avdv opened 4 months ago

avdv commented 4 months ago

We are using the bazel-remote-worker locally for isolation.

For a target we see this error reproducibly:

[2024-02-13T10:34:08.702+00:00] Action failed: root//frontend:app (prelude//platforms:default#213ed1b7ab869379) (genrule)
[2024-02-13T10:34:08.702+00:00] Internal error (stage: materialize_outputs): action_digest=9f86841c1d695a3dbf441d2d9c984a6c750f2ccbd6b0e59de286f42e5ff060ab:142: Failed to declare in materializer: Failed to make BatchReadBlobs request: status: OutOfRange, message: "Error, message length too large: found 4194593 bytes, the limit is: 4194304 bytes", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "grpc-encoding": "identity", "grpc-accept-encoding": "gzip"} }

I found this post, which pinpointed the problem. The error is actually thrown by tonic, here.

The bazel-remote-worker reports a max batch size of 4MiB in its capabilities. If a request's batch size gets close to this limit, the server's response (including headers and possibly compression metadata) can exceed the 4MiB default transport message size enforced by tonic, which causes this error.
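
The two byte counts in the error message make the overshoot concrete. Here is a quick arithmetic check using only those numbers (the variable names are mine, and it assumes the blob payload itself stayed within the advertised 4MiB):

```python
# Byte counts copied from the error message in the log above.
found_bytes = 4194593  # size of the response message that tonic rejected
limit_bytes = 4194304  # the limit named in the error, i.e. 4 MiB

# Sanity check: the limit is exactly 4 MiB.
assert limit_bytes == 4 * 1024 * 1024

# How far the response went over the limit. If the payload was at most
# 4 MiB (assumption), at least this many bytes were non-payload overhead.
overshoot = found_bytes - limit_bytes
print(overshoot)  # 289
```

So the response was only a few hundred bytes too large, which fits the explanation that per-message overhead pushed a nearly full batch over the limit.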

As a workaround, we have set the max batch size to 4MB for the bazel-remote-worker.

The max message size enforced by tonic is also configurable, and it should probably be set to a value larger than the max batch size reported by the server.
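
To illustrate what "larger than the max batch size" could mean in numbers, here is a minimal sketch. The helper name and the 64 KiB margin are hypothetical choices of mine, not anything buck2 or tonic prescribes. In tonic, such a value would presumably go to the generated client's `max_decoding_message_size`; whether and where buck2 exposes that setting is not something I have checked.

```python
def suggested_transport_limit(max_batch_bytes: int, margin_bytes: int = 64 * 1024) -> int:
    """Return a transport message limit with room for per-message overhead.

    Hypothetical helper: `margin_bytes` is an arbitrary illustrative value,
    not a number taken from tonic, buck2, or bazel-remote-worker.
    """
    if max_batch_bytes < 0 or margin_bytes <= 0:
        raise ValueError("sizes must be non-negative and the margin positive")
    return max_batch_bytes + margin_bytes


# Max batch size advertised by the server in this report: 4 MiB.
advertised_batch = 4 * 1024 * 1024

limit = suggested_transport_limit(advertised_batch)
print(limit)  # 4259840

# The rejected message from the log (4194593 bytes) would fit under this limit.
assert limit > 4194593
```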

aherrmann commented 4 months ago

Adding emphasis on the difference in numbers, since it's subtle and may be hard to spot:

The bazel-remote-worker reports a max batch size of 4MiB (4*1024*1024) in its capabilities. As a workaround we have set the max batch size to 4MB (4*1000*1000) for the bazel-remote-worker.
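
Spelled out in bytes, the gap between the two settings is what leaves room for the response to fit under tonic's limit. A small check (the constant names are mine; it assumes tonic's limit stays at the 4194304 bytes shown in the log):

```python
MIB = 1024 * 1024  # binary megabyte (mebibyte)
MB = 1000 * 1000   # decimal megabyte

advertised_batch = 4 * MIB  # what the worker reported by default
workaround_batch = 4 * MB   # what the workaround sets it to

print(advertised_batch)  # 4194304
print(workaround_batch)  # 4000000

# Room left between a full batch under the workaround and tonic's 4 MiB limit.
headroom = advertised_batch - workaround_batch
print(headroom)  # 194304
```

With the workaround in place, a full batch leaves roughly 190 KiB for headers and other per-message overhead before tonic's default limit is reached.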