s3gw-tech / s3gw

Container able to run on a Kubernetes cluster, providing S3-compatible endpoints to applications.
https://s3gw.tech
Apache License 2.0

rgw/sfs: End-to-end Checksums (Epic) #26

Open irq0 opened 2 years ago

irq0 commented 2 years ago

Having end to end checksums is nice during development and even nicer when the system encounters broken hardware.

A possible design would be to checksum the data in fixed-size blocks (1M, 4k, $whatevermakessense): section the backend file and add headers with CRC info. Or use a second, sparse file mapping offsets to headers. Or SQLite.
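A minimal sketch of the block-checksum idea with the SQLite variant, in Python for brevity. The 4k block size, the `block_crc` table, and both function names are assumptions made for illustration, not part of any s3gw design:

```python
import sqlite3
import zlib

BLOCK_SIZE = 4096  # assumed; the issue leaves the block size open (1M, 4k, ...)


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list[tuple[int, int]]:
    """Return (offset, crc32) for every block_size-sized block of data."""
    return [
        (off, zlib.crc32(data[off:off + block_size]))
        for off in range(0, len(data), block_size)
    ]


def store_checksums(db: sqlite3.Connection, object_id: str,
                    sums: list[tuple[int, int]]) -> None:
    """Persist per-block checksums in a hypothetical side table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS block_crc ("
        "object_id TEXT, block_offset INTEGER, crc INTEGER, "
        "PRIMARY KEY (object_id, block_offset))"
    )
    db.executemany(
        "INSERT OR REPLACE INTO block_crc VALUES (?, ?, ?)",
        [(object_id, off, crc) for off, crc in sums],
    )
    db.commit()
```

Keeping the checksums outside the data file leaves the object layout untouched; in-file headers would instead shift every data offset by the header size.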

Verify checksums on every operation that reads the data from disk (get, copy, etc.).
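Verification on the read path could look roughly like the following sketch. It assumes per-block CRC32 values are already available as a dict keyed by block offset; `verified_read`, `ChecksumError`, and the 4k block size are hypothetical names and values, not existing s3gw code:

```python
import zlib


class ChecksumError(IOError):
    """Raised when a block read from disk does not match its stored CRC."""


def verified_read(f, crcs: dict[int, int], offset: int, length: int,
                  block_size: int = 4096) -> bytes:
    """Read [offset, offset + length) from f, verifying every block touched."""
    first = (offset // block_size) * block_size  # align down to a block boundary
    end = offset + length
    out = bytearray()
    pos = first
    while pos < end:
        f.seek(pos)
        block = f.read(block_size)
        if not block:
            break  # request extends past end of file
        if zlib.crc32(block) != crcs[pos]:
            raise ChecksumError(f"CRC mismatch in block at offset {pos}")
        out += block
        pos += block_size
    skip = offset - first  # drop the bytes before the requested offset
    return bytes(out[skip:skip + length])
```

Note that a ranged read has to fetch whole blocks in order to verify them, which is one place where the block size trades off against read amplification.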

Related: https://aws.amazon.com/blogs/aws/new-additional-checksum-algorithms-for-amazon-s3/

Tasks

jecluis commented 10 months ago

Seems reasonable. Maybe a candidate for v0.23.0; alternatively, we'll push this for GA.

jecluis commented 10 months ago

@irq0 how feasible is this for v0.23.0?

irq0 commented 10 months ago

It needs design work to be certain. Off the top of my head I'd say better not. While it would increase robustness and confidence in the IO path, to do this right we need failure injection testing. An implementation also needs to be careful not to cause a performance regression.
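As a rough illustration of what a failure injection test could check, the sketch below flips single random bits in a buffer and asserts that a CRC32 over it changes. The helper names are made up for this example, and a real test would presumably inject faults at the storage layer rather than in memory:

```python
import random
import zlib


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of data with exactly one randomly chosen bit inverted."""
    corrupted = bytearray(data)
    bit = rng.randrange(len(corrupted) * 8)
    corrupted[bit // 8] ^= 1 << (bit % 8)
    return bytes(corrupted)


def detects_corruption(data: bytes, trials: int = 100, seed: int = 0) -> bool:
    """Check that every injected single-bit flip changes the CRC32 of data."""
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    expected = zlib.crc32(data)
    return all(
        zlib.crc32(flip_random_bit(data, rng)) != expected
        for _ in range(trials)
    )
```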

Related issues that make sense to co-design:
https://github.com/aquarist-labs/s3gw/issues/669 - store the checksum there as well, or only there
https://github.com/aquarist-labs/s3gw/issues/481 - use this checksum for versions

jecluis commented 10 months ago

Alright, let's reevaluate for v0.25.0. And let's add those two as tasks for this one.