containerd / accelerated-container-image

A production-ready remote container image format (overlaybd) and snapshotter based on block-device.
Apache License 2.0

Fast pull the full images in parallel without lazy loading #195

Open shuaichang opened 1 year ago

shuaichang commented 1 year ago

What is the version of your Accelerated Container Image

No response

What would you like to be added?

OverlayBD is great at accelerating container image pulls. We've enjoyed the benefits and appreciate all the support from the community!

Why is this needed for Accelerated Container Image?

Problems

On-demand data transfer and trace-based prefetch are great tools. However, we see a gap between them that could be filled: fast prefetch of all blobs.

The following are the reasons:

  1. For some applications, lazy loading changes application behavior. One example is a K8s workload with startup/liveness/readiness probes: before lazy pulling, it starts up with no issue; after onboarding to lazy pulling, it fails to start because the previous probe period is no longer long enough. This makes it hard for some applications to adopt OverlayBD without changing their config.
  2. It's not easy to observe the overall image pull latency, because the time is attributed to application startup instead. Lazy pulling also introduces a new failure mode: previously, we would not run an application unless the image pull succeeded. With lazy pulling, a failure can surface as a runtime I/O hang or other errors that are hard to debug (especially when different teams own the application and the container/image runtime infra).
  3. Downloading full blobs can also be fast; usually only decompression is slow, and decompression of OverlayBD images is very fast. With high concurrency, we were able to saturate the VM bandwidth and download a multi-GB OverlayBD image in several seconds.

I am aware that trace-based prefetch would make this issue much better, but it can be costly to add trace recording to the CI/CD build system in a large-scale infra with many dependencies.

Therefore, I feel that if OverlayBD had a feature that sits between lazy loading and trace-based prefetch (let's just call it Prefetch), it would be a perfect solution that doesn't require much of a learning curve or courage to adopt (Problem 2 is a pretty big mindset shift that can slow down adoption).

Options

We propose some options here; please feel free to add more.

Please feel free to contribute ideas as well. Again, we appreciate all the great work from the OverlayBD community. By contributing real-world use cases and requirements, we hope we can also help drive OverlayBD adoption.

Thanks!

Are you willing to submit PRs to contribute to this feature?

liulanzheng commented 1 year ago

@shuaichang I think in general, there are two implementation paths:

  1. In containerd: use rpull --download-blobs to pull overlaybd images, but two improvements are needed. One is downloading a single blob in parallel chunks to speed up the download. The other is removing the untar/decompression step between the content store and the snapshot.
  2. In overlaybd: use the download cache type, with a full-speed background download with no delay, or a full-speed prefetch. I also think external_image_puller is feasible for the download cache type: external_image_puller downloads the blobs and writes them into the corresponding snapshot directories.
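
To make the "download a single blob in parallel chunks" idea from path 1 concrete, here is a minimal Python sketch. It is not containerd or overlaybd code: `split_ranges`, `parallel_fetch`, and the `fetch_range` callback are hypothetical names used only for illustration, and the chunk size and worker count are arbitrary. The blob is split into byte ranges, the ranges are fetched concurrently, and the results are reassembled in order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that cover [0, size)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(start, min(start + chunk_size, size) - 1)
            for start in range(0, size, chunk_size)]


def parallel_fetch(fetch_range: Callable[[int, int], bytes],
                   size: int, chunk_size: int, workers: int = 8) -> bytes:
    """Fetch every range concurrently and join the chunks in offset order.

    fetch_range is a hypothetical callback supplied by the caller; it
    must return the bytes for the inclusive range (start, end).
    """
    ranges = split_ranges(size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the chunks
        # come back already sorted by offset.
        chunks = list(pool.map(lambda r: fetch_range(*r), ranges))
    return b"".join(chunks)


# In-memory stand-in for a registry blob, so the sketch runs offline.
blob = bytes(range(256)) * 40


def fake_fetch(start: int, end: int) -> bytes:
    return blob[start:end + 1]


downloaded = parallel_fetch(fake_fetch, len(blob), chunk_size=1000)
```

In a real puller, `fetch_range` would presumably issue an HTTP GET with a `Range: bytes=start-end` header against the registry blob URL. A production version would also write each chunk directly to its file offset rather than buffering the whole blob in memory.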