buildbarn / bb-storage

Storage daemon, capable of storing data for the Remote Execution protocol
Apache License 2.0

Support for compression from buildbarn #158

Open eagleonhill opened 1 year ago

eagleonhill commented 1 year ago

The Bazel remote protocol now has a compression option. That would be very useful for cases where Bazel and Buildbarn run in different clusters, and it would also reduce storage usage.

EdSchouten commented 1 year ago

Hey @eagleonhill,

When compression support was added to the Remote Execution protocol, I was opposed to adding it in its given form, because the transmitting side does not properly send the size of the compressed object to the receiving side. With the way LocalBlobAccess is designed, that makes it pretty hard to allocate space for objects as they are being written, since it tries to store objects consecutively. This means that compression would either be limited to just the wire (not at rest), or bb-storage would need to do unnecessary copying of data to guarantee consecutive storage.

Furthermore, one downside of compression is that it's a lot harder to provide fast random access to blobs. As Buildbarn relies heavily on virtual file systems (based on FUSE/NFSv4), this is something that we need. Even without compression the protocol's story around random access is a bit messy, as there's no way to guarantee data integrity by reading CAS objects partially.
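
(To make that last point concrete: a CAS digest is a hash over the entire blob, so only a full read can be verified. The small Go illustration below assumes the usual SHA-256 digest convention; it is not Buildbarn code.)

package example

import (
    "crypto/sha256"
    "encoding/hex"
)

// verifyBlob can only validate a blob once every byte is available: the CAS
// digest is computed over the whole object, so a partial read in isolation
// cannot be checked against it.
func verifyBlob(fullBlob []byte, expectedHexDigest string) bool {
    sum := sha256.Sum256(fullBlob)
    return hex.EncodeToString(sum[:]) == expectedHexDigest
}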

Because of that, I am currently spending time with the remote API working group to get chunking/decomposition of CAS objects added. See these PRs:

Once we have proper support for chunking/decomposition, the story around compression will become a lot simpler. All (primitive) CAS objects will become bounded in size. This means that even if the receiving side doesn't know the size of a compressed CAS object, it can safely ingest it into memory to compute its size, before letting LocalBlobAccess write it to disk.
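
(As an illustration of that last point, here is a rough sketch of what the receiving side could do once objects are bounded in size. It assumes the github.com/klauspost/compress/zstd package and uses a placeholder writeBlob callback in place of LocalBlobAccess; none of this is existing Buildbarn code.)

package example

import (
    "github.com/klauspost/compress/zstd"
)

// ingestBoundedObject decompresses a size-bounded compressed CAS object fully
// in memory, so that its exact uncompressed size is known before space is
// allocated for it. writeBlob stands in for the actual storage backend.
func ingestBoundedObject(compressed []byte, writeBlob func(sizeBytes int64, data []byte) error) error {
    decoder, err := zstd.NewReader(nil)
    if err != nil {
        return err
    }
    defer decoder.Close()
    uncompressed, err := decoder.DecodeAll(compressed, nil)
    if err != nil {
        return err
    }
    // Only now is the size known, so consecutive space can be allocated exactly.
    return writeBlob(int64(len(uncompressed)), uncompressed)
}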

So in summary, the reason Buildbarn doesn't support compression is not because we don't care. We want to do it right, and that still takes a bit more time to sort out. I will keep this issue open, so that people can remain informed on updates.

bytearth commented 1 year ago

Hey @EdSchouten, compression is also a very useful feature for us. I have read the code of bazel-remote; in their design, when they receive a bytestream.Write request with a resource path like {instance_name}/uploads/{uuid}/compressed-blobs/{compressor}/{uncompressed_hash}/{uncompressed_size}{/optional_metadata}, they decode the compressed data and write it to a file when using the diskCache backend. zstd supports "streaming decompression", so they can decode and write in a loop, like this:

for {
    compressedData := src.Recv()
    uncompressedData := zstd.Decode(compressedData)
    dst.Write(uncompressedData)
}

Maybe Buildbarn could give this a try: when it receives a WriteRequest, allocate space for the object identified by uncompressed_hash, then receive, decode, and write to the target Block, as in the sketch below.
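
(A slightly more complete sketch of that loop in Go, assuming the github.com/klauspost/compress/zstd package and an io.Reader/io.Writer pair wrapped around the incoming ByteStream chunks and the target Block; the function name is made up for illustration.)

package example

import (
    "io"

    "github.com/klauspost/compress/zstd"
)

// decompressToBlock streams a zstd-compressed blob from src and writes the
// decompressed bytes to dst, without holding the whole blob in memory.
func decompressToBlock(src io.Reader, dst io.Writer) error {
    decoder, err := zstd.NewReader(src)
    if err != nil {
        return err
    }
    defer decoder.Close()
    // io.Copy repeatedly reads decompressed data from the decoder and writes
    // it out, which is the "decode and write in a loop" idea described above.
    _, err = io.Copy(dst, decoder)
    return err
}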

EdSchouten commented 1 year ago

That’s the point: if we add compression support, I don’t just want compression in transit, but also at rest. And that’s problematic, as we can’t allocate space for something we don’t know the size of.

jmmv commented 1 year ago

From my reading of this bug, I still don't understand why supporting gRPC-level in-transit compression (https://github.com/grpc/grpc-go/blob/master/Documentation/compression.md) is not desired/allowed. This feature should be easy to enable and it is completely invisible to the application-level logic, correct?
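
(For reference, a minimal client-side sketch of what enabling grpc-go's built-in compression looks like; the dial helper and the use of insecure credentials are just placeholders for illustration.)

package example

import (
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/grpc/encoding/gzip" // importing this registers the gzip compressor
)

// dialWithCompression opens a connection on which every RPC compresses its
// request messages with gzip and advertises gzip support to the server.
func dialWithCompression(address string) (*grpc.ClientConn, error) {
    return grpc.Dial(
        address,
        grpc.WithTransportCredentials(insecure.NewCredentials()),
        grpc.WithDefaultCallOptions(grpc.UseCompressor(gzip.Name)),
    )
}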

The reason I ask is that we face a situation where some of our artifacts are too large to download in a reasonable amount of time. However, these artifacts compress very well and they almost never change (so they stay in the cache for a long time), which means we don't really care about on-disk compression. Having the ability to compress in transit only would solve 90% of our problems with slow builds due to downloads, and I think this should be a fairly simple change?

More specifically, what we'd be looking at is having compression between the frontend and bb-clientd, and nothing else. Neither Bazel nor the storage nodes would even have to know that the compression is happening.

kjteske commented 1 year ago

+1 to gRPC-level in-transit compression.

Our biggest artifacts (~1GB) are large binaries with debug information that we know compress really well: adding "-gz=zlib" to our gcc options shrinks these by 60% or more. Unfortunately "-gz=zlib" adds 40+ seconds to the link time, which is a dealbreaker. We're optimistic gRPC compression would perform better.

(gcc 13 adds a "-gz=zstd" option which might be better, but we're on gcc 12.3 for now).