oxidecomputer / crucible

A storage service.
Mozilla Public License 2.0

Make bulk write `MAX_CHUNK_SIZE` bigger? #1020

Open david-crespo opened 11 months ago

david-crespo commented 11 months ago

Currently we upload images through the API in 512 KiB chunks, which is quite small. My understanding is that the constraint comes from `MAX_CHUNK_SIZE` in the pantry (linked below), and the comment https://github.com/oxidecomputer/crucible/pull/489#discussion_r1010600221 from when it was added says the value is arbitrary. I suspect image (and soon TUF repo) uploads through the console could go faster if this were bigger.

https://github.com/oxidecomputer/crucible/blob/3927d8da246f77bc9bd70559b82646317d6541e5/pantry/src/pantry.rs#L91

https://github.com/oxidecomputer/crucible/blob/3927d8da246f77bc9bd70559b82646317d6541e5/pantry/src/pantry.rs#L264-L279
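For a sense of scale, here is a minimal sketch of how many bulk-write requests an upload needs at a given chunk size. The 512 KiB value comes from this issue. The function name `chunk_count` and the image sizes in the usage note are illustrative assumptions, not code from the pantry.

```rust
/// Chunk size discussed in this issue: 512 KiB.
const MAX_CHUNK_SIZE: u64 = 512 * 1024;

/// Number of requests needed to upload `image_bytes` when each request
/// carries at most `chunk_size` bytes (hypothetical helper).
fn chunk_count(image_bytes: u64, chunk_size: u64) -> u64 {
    assert!(chunk_size > 0, "chunk size must be non-zero");
    // Ceiling division: a trailing partial chunk still costs one request.
    (image_bytes + chunk_size - 1) / chunk_size
}
```

Under these assumptions, a 2 GiB image takes `chunk_count(2 << 30, MAX_CHUNK_SIZE)` = 4096 requests at 512 KiB, versus 512 requests at 4 MiB.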

jclulow commented 11 months ago

It's smallish, but not tiny. We picked this size at the time because it's what EBS Direct allows you to write in a single PUT request to a volume there. I don't think we want to make this too large, as it represents data we may need to buffer in memory or on disk so that its integrity can be verified prior to being pushed to the backing volume. Keeping individual requests relatively constrained also has benefits for rate limiting, monitoring, and resource controls on the server side.

To increase throughput, I believe you can issue multiple requests in parallel. That's also what people generally do with the somewhat similar EBS Direct facility as far as I can tell.
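The parallel approach can be sketched as a fixed pool of workers that each claim the next unsent chunk. This is only an illustration under assumptions: `upload_parallel` and the `send` callback are hypothetical names, `send` stands in for whatever actually issues the bulk-write request, and error handling and retries are left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Upload `data` in chunks of at most `chunk_size` bytes using `workers`
/// threads. `send` is a hypothetical stand-in for the real request; it
/// receives the byte offset of the chunk and the chunk itself.
/// Returns the number of chunks sent.
fn upload_parallel<F>(data: &[u8], chunk_size: usize, workers: usize, send: F) -> usize
where
    F: Fn(usize, &[u8]) + Sync,
{
    assert!(chunk_size > 0 && workers > 0);
    let total_chunks = (data.len() + chunk_size - 1) / chunk_size;
    // Shared counter: each worker atomically claims the next chunk index.
    let next = AtomicUsize::new(0);

    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= total_chunks {
                    break;
                }
                let start = i * chunk_size;
                let end = usize::min(start + chunk_size, data.len());
                send(start, &data[start..end]);
            });
        }
    });

    total_chunks
}
```

Because each chunk carries its own offset, the chunks may complete in any order. At most `workers` requests are in flight at a time, which bounds how much data the server has to buffer at once.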

david-crespo commented 11 months ago

We make up to 6 requests in parallel on web as I believe that is the most browsers will let you do anyway (it's hard to pin down a clear answer on this but I've tested it manually and it seems right).

In addition to the downsides you mention, it may well be that increasing the chunk size would not speed things up much anyway. If time in transit is a small proportion of the total time, most of the time is spent on the server processing the chunk, and that processing time is linear in the chunk size, then increasing the chunk size won't do anything. The only way to speed things up would be to cut down the processing time.
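A rough model makes that argument concrete. All symbols here are assumptions introduced for illustration: $S$ is the image size, $c$ the chunk size, $p$ the number of requests in flight, $t_0$ a fixed per-request overhead, $b$ the network bandwidth, and $k$ the server processing time per byte. Assuming $c$ divides $S$ and the $p$ requests do not contend with each other:

$$
t_{\text{req}}(c) = t_0 + \frac{c}{b} + k\,c,
\qquad
T(c) \approx \frac{S}{p\,c}\, t_{\text{req}}(c) = \frac{S}{p}\left(\frac{t_0}{c} + \frac{1}{b} + k\right).
$$

Only the $t_0 / c$ term shrinks as $c$ grows. If $k$ dominates both $1/b$ and $t_0 / c$, then $T(c) \approx S k / p$, which does not depend on the chunk size at all.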

jclulow commented 11 months ago

Is it possible to get statistics in the client for the request latency for each chunk?

david-crespo commented 11 months ago

Yeah, I will see about getting some stats.
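As a sketch of which stats might be useful, the per-chunk latencies could be reduced to a count, minimum, median, 95th percentile, and maximum. The console client runs in the browser, so this Rust version only illustrates the computation. `LatencySummary`, `percentile`, and `summarize` are hypothetical names, and the percentiles use the nearest-rank method.

```rust
use std::time::Duration;

/// Summary of per-chunk request latencies (hypothetical type).
#[derive(Debug, PartialEq)]
struct LatencySummary {
    count: usize,
    min: Duration,
    median: Duration,
    p95: Duration,
    max: Duration,
}

/// Nearest-rank percentile of an ascending, non-empty slice.
/// `pct` is a whole-number percentage in 1..=100.
fn percentile(sorted: &[Duration], pct: usize) -> Duration {
    let n = sorted.len();
    // rank = ceil(pct / 100 * n), computed in integers to avoid float rounding.
    let rank = (pct * n + 99) / 100;
    sorted[rank.clamp(1, n) - 1]
}

/// Summarize the latencies, or return `None` if there are no samples.
fn summarize(latencies: &[Duration]) -> Option<LatencySummary> {
    if latencies.is_empty() {
        return None;
    }
    let mut sorted = latencies.to_vec();
    sorted.sort();
    Some(LatencySummary {
        count: sorted.len(),
        min: sorted[0],
        median: percentile(&sorted, 50),
        p95: percentile(&sorted, 95),
        max: sorted[sorted.len() - 1],
    })
}
```

If the median stays roughly constant as the number of parallel requests changes, that would suggest the per-chunk time is dominated by server-side processing rather than by contention for bandwidth.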