containerd / accelerated-container-image

A production-ready remote container image format (overlaybd) and snapshotter based on block-device.
Apache License 2.0
405 stars 73 forks source link

Avoid potential connectivity related layer corruption in userspace convertor #289

Closed estebanreyl closed 3 months ago

estebanreyl commented 3 months ago

Avoid potential connectivity related layer corruption in userspace convertor

Add additional checks to prevent deduplicated layer commit files from having partial downloads leading to corrupted images. Signed-off-by: Esteban esrey@microsoft.com

What this PR does / why we need it: This PR aims to remove an issue with the user space convertor accepting partial deduplicated commit files in the event that the puller suffers from connectivity issues. I encountered this situation specifically with a custom resolver, fetcher, etc but it should repro with the docker ones, either way this hardens that path and prevents any such issues from happening. The main goal is to verify the deduplicated commit layer before upload. Note that such failures would not necessarily lead to a corrupted image in the past, this would only happen if the layer in question is not required in a later commit step, if it is this would lead to an issue being thrown in a subsequent overlaybd-apply. This PR adds recoverability from that situation or allows for explicit failure avoiding corruption in case of running into the issue. The PR also adds general digest verification checks for the layer downloads.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #

Please check the following list:

WaberZhuang commented 3 months ago

Thanks for the pr, it's true that we need to do this because the fetcher is not responsible for validation.

I suggest to validate the layer (deduplicated or not) after the download, or is there something wrong with this way?

estebanreyl commented 3 months ago

Thanks for the pr, it's true that we need to do this because the fetcher is not responsible for validation.

I suggest to validate the layer (deduplicated or not) after the download, or is there something wrong with this way?

This is a great suggestion; at the moment I only validate the dedup layer on the upload since failures on the other scenario led to a complete failure but as you mention the fetcher is not technically responsible for verification so its probably best to guard for the regular layer case too. I'll add a digest validation check on downloadLayer which will account for that additional check.