containerd / overlaybd

Overlaybd: a block based remote image format. The storage backend of containerd/accelerated-container-image.
Apache License 2.0

Does layer caching still work after switching to overlaybd? #68

Closed bengbeng-pp closed 3 years ago

bengbeng-pp commented 3 years ago

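Layer caching means: when some layers in different images have identical content, the machine shares that layer and only needs to download it once.

Question: if these images are converted to DADI images, can layers with identical content still share the local cache?

e.g.: image A has layers a b c; image B has layers c d f. The sha256 of the shared layer (c) is identical in image A and image B.

After image A and image B are converted to accelerated images:

On the same machine, once accelerated image A has been downloaded, does layer c still need to be downloaded when pulling accelerated image B?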

bengbeng-pp commented 3 years ago

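A follow-up question: image A has layers a b c d e f, and image B has layers a b c X Y Z. The first three layers of image A and image B have the same digests. After both are converted to accelerated images, the first three layers of accelerated image A and accelerated image B have different digests. What is the reason?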

BigVan commented 3 years ago

Hi, we provide the 'obdconv' command option, which converts an OCI image (tgz). For each layer, it just unpacks the tgz data into overlaybd. However, we can't control the block-device-level data layout when an image is converted multiple times. The converted layers still hold the same data from the filesystem's point of view, so you don't need to worry about the digest differing after the same layer is converted again.

For the reason above, the image cache can't be reused when the same layer is converted twice.

bengbeng-pp commented 3 years ago

Why can't the block-device-level data layout be determined when the image is converted multiple times?
This makes image layer pre-distribution unavailable.

Is there an optimization method or a plan for one?

BigVan commented 3 years ago

> Why can't the block-device-level data layout be determined when the image is converted multiple times?

The data layout on the block device is controlled by the Linux kernel... Furthermore, the metadata (e.g. creation time/access time) of the image content is also modified during conversion.
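The effect of changed metadata on a digest can be reproduced with a small analogy. The sketch below does not use overlaybd at all: it packs the same files into two plain tar archives that differ only in their recorded modification time (the file names and timestamps are made up for illustration) and compares the blob digests.

```python
import hashlib
import io
import tarfile


def pack(files, mtime):
    """Pack files into an uncompressed tar blob, stamping every entry with mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime  # metadata only; file content is untouched
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def blob_digest(blob):
    """Digest of the raw bytes, the way a registry identifies a blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def content_view(blob):
    """What a reader of the archive sees: file names and file bytes."""
    with tarfile.open(fileobj=io.BytesIO(blob)) as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}


# Hypothetical layer content, "converted" twice one minute apart.
files = {"etc/hostname": b"demo\n", "bin/app": b"example-binary"}
first = pack(files, mtime=1_600_000_000)
second = pack(files, mtime=1_600_000_060)

same_content = content_view(first) == content_view(second)
same_digest = blob_digest(first) == blob_digest(second)
print("same content:", same_content)
print("same digest:", same_digest)
```

Both archives expose identical files, yet their blob digests differ because the timestamp is part of the hashed bytes. An overlaybd layer is a block-level format rather than a tar archive, so this only illustrates the general principle, not the actual conversion.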

To make the image cache usable, we need to record the relationship between each tgz layer and its overlaybd layer. That makes it possible to avoid converting a layer multiple times. A KV store might be used for this, and we will make a plan to do it.
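The KV idea could look roughly like the sketch below. It is only an illustration under assumptions: `ConversionIndex`, `convert_once`, and `fake_convert` are hypothetical names, the store is an in-memory dict, and the converter is a stand-in that returns a new digest on every call to mimic a conversion that is not reproducible. A real design may need a different key (for example one that also covers the parent layers).

```python
from typing import Callable, Dict, List, Optional


class ConversionIndex:
    """Hypothetical KV store: tgz layer digest -> overlaybd layer digest."""

    def __init__(self) -> None:
        self._kv: Dict[str, str] = {}

    def lookup(self, tgz_digest: str) -> Optional[str]:
        return self._kv.get(tgz_digest)

    def record(self, tgz_digest: str, obd_digest: str) -> None:
        self._kv[tgz_digest] = obd_digest


def convert_once(tgz_digest: str, index: ConversionIndex,
                 convert: Callable[[str], str]) -> str:
    """Convert a layer only if no earlier conversion has been recorded."""
    cached = index.lookup(tgz_digest)
    if cached is not None:
        return cached  # reuse the earlier result, so the digest stays stable
    obd_digest = convert(tgz_digest)
    index.record(tgz_digest, obd_digest)
    return obd_digest


calls: List[str] = []


def fake_convert(tgz_digest: str) -> str:
    """Stand-in converter: every call yields a different digest."""
    calls.append(tgz_digest)
    return f"sha256:obd-{len(calls)}"


index = ConversionIndex()
layer_c_for_a = convert_once("sha256:c", index, fake_convert)  # image A
layer_c_for_b = convert_once("sha256:c", index, fake_convert)  # image B
print(layer_c_for_a == layer_c_for_b, len(calls))
```

With the index in place, layer c is converted once and both accelerated images reference the same converted layer, which is what lets a local cache keyed by digest be shared.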

BTW, global deduplication of image conversion has already been released on Alibaba Cloud (ACR-ee).

bengbeng-pp commented 3 years ago

Thanks for answering. There is another problem. The local cache file path is XXX/repo/blob-digest, so only files under the same repo can share the cache. The local cache here refers to "/opt/overlaybd/registry_cache".

What was the original intention of this design? Are there any plans to share the local cache across different repos?
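A small sketch of why the path layout matters. It assumes, for illustration only, that the "XXX" prefix is the cache directory quoted above; the repo names and the digest are made up, and `shared_path` is a hypothetical digest-only layout, not something overlaybd provides.

```python
from pathlib import PurePosixPath

# Assumption: the cache root is the directory quoted in the comment above.
CACHE_ROOT = PurePosixPath("/opt/overlaybd/registry_cache")


def per_repo_path(repo: str, blob_digest: str) -> PurePosixPath:
    """Layout described in the thread: <root>/<repo>/<blob-digest>."""
    return CACHE_ROOT / repo / blob_digest


def shared_path(blob_digest: str) -> PurePosixPath:
    """Hypothetical alternative keyed by digest only."""
    return CACHE_ROOT / blob_digest


digest = "sha256:0123abcd"  # made-up digest of one identical blob
path_in_team_a = per_repo_path("team-a/app", digest)
path_in_team_b = per_repo_path("team-b/app", digest)
print(path_in_team_a)
print(path_in_team_b)
print(shared_path(digest))
```

With the per-repo layout the same blob maps to two different cache files, so it is fetched once per repo. A digest-only key would map both to one file.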

BigVan commented 3 years ago

> only files under the same repo can share the cache.

You are right. 😂 Of course, sharing the local cache is necessary, and I will try to do it. This repo has been open source for only a few months... There's still lots of room for improvement. Thank you very much.

bengbeng-pp commented 3 years ago

Thanks for answering.

As a temporary solution, here is an idea:

Is it possible to specify a basepath, so as to avoid repeated layer conversion?

How do I create a basepath? Can I convert a DADI image into a basepath file?


BigVan commented 3 years ago

> Is it possible to specify a basepath, so as to avoid repeated layer conversion?

I don't think it is an elegant way to solve this problem.

You should open an issue in https://github.com/containerd/accelerated-container-image...