containerd / accelerated-container-image

A production-ready remote container image format (overlaybd) and snapshotter based on block-device.
Apache License 2.0

How to prefetch data using snapshotter as block device without mount? #112

xxinran closed this issue 2 years ago

xxinran commented 2 years ago

Prefetching currently relies on a trace file in the top layer of an image. What if we use the block device directly, without mounting any filesystem on it? Is it still possible to prefetch the data?

beef9999 commented 2 years ago

Yes.


Google's stargz image format also has a prefetch feature. It works by marking specific files in the image header; those files are pulled first when the container runs.

In the header of the stargz file format, a landmark separates the prioritized files, which come before it, from the rest of the content, which comes after it. The prioritized files are identified by the absolute paths of the files to pull first, and the chunks that make up each of those files can be looked up in the TOC (the file-entry table) at the end.
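The landmark idea can be sketched in a few lines. This is only an illustration, not the stargz implementation: it assumes the entries are available as an ordered list of names, and both the function name count_prioritized and the sample paths are hypothetical. The landmark name used in the usage note, ".prefetch.landmark", is the one I believe eStargz uses, but treat it as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given entry names in archive order, return how many
 * entries precede the landmark. Those are the prioritized files.
 * Returns 0 when the landmark is absent (nothing is prioritized).
 */
size_t count_prioritized(const char *const *entries, size_t n,
                         const char *landmark)
{
    assert(entries != NULL || n == 0);
    assert(landmark != NULL);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(entries[i], landmark) == 0)
            return i; /* everything before index i is pulled first */
    }
    return 0;
}
```

Called with the entries "bin/sh", "etc/passwd", ".prefetch.landmark", "usr/share/doc/README", the helper reports that the first two entries are prioritized.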

stargz provides a snapshotter for containerd and extends the ctr client with additional commands, including ctr-remote images optimize. That command runs the container image in a sandbox environment to determine which files should be prioritized.

Trace is a time-based prefetching technique developed by DADI. It records the I/O issued while the container starts, saves that record in the image metadata, and replays the recorded I/O at the next startup. Replaying fills the cache ahead of demand, which shortens startup time.

The DADI format stacks the contents of all layers of a container image into a single block device. At the OS level the user reads the DADI image through that block device, but the trace is recorded after LSMT translation, so each prefetch record refers to an image layer rather than to the block device.
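To make the "after LSMT conversion" step concrete, here is a much simplified sketch of resolving a block-device offset to a layer and an offset inside that layer's file. It is not the overlaybd code: the segment struct, the resolve function, and the rule that a higher layer_index is the newer layer are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one contiguous device range stored in a layer file. */
struct segment {
    int layer_index; /* layer holding the data; assumed higher = newer */
    int dev_offset;  /* start of the range on the virtual block device */
    int length;      /* number of bytes covered */
    int file_offset; /* where the range starts inside the layer file */
};

/*
 * Resolve a device offset to (layer, offset in layer file).
 * When several layers cover the offset, the newest layer wins.
 * Returns 1 and fills the outputs on a hit, 0 if no layer covers it.
 */
int resolve(const struct segment *segs, size_t n, int dev_offset,
            int *layer_index, int *file_offset)
{
    assert(layer_index != NULL && file_offset != NULL);

    int found = 0;
    for (size_t i = 0; i < n; i++) {
        const struct segment *s = &segs[i];
        if (dev_offset < s->dev_offset ||
            dev_offset >= s->dev_offset + s->length)
            continue; /* this segment does not cover the offset */
        if (!found || s->layer_index > *layer_index) {
            *layer_index = s->layer_index;
            *file_offset = s->file_offset + (dev_offset - s->dev_offset);
            found = 1;
        }
    }
    return found;
}
```

A trace recorder would log the resolved layer and file offset, not the device offset the user issued, which is why the records are per layer.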

If the format of trace is defined as a combination of frames according to the time series, the format of a frame is as follows:

struct {
    int layer_index; // index of the layer this I/O was served from
    int offset;      // offset of this I/O within the layer file
    int count;       // number of bytes read by this I/O
};
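A minimal sketch of replaying such frames follows. It assumes the frames are already loaded into an array in recorded order; the names trace_frame, layer_read_fn, replay_trace and stub_read are hypothetical, and the actual fetch is left to a caller-supplied callback because the thread does not describe how overlaybd issues the reads.

```c
#include <assert.h>
#include <stddef.h>

/* Frame layout as given in the thread; the struct name is hypothetical. */
struct trace_frame {
    int layer_index; /* index of the layer this I/O was served from */
    int offset;      /* offset of this I/O within the layer file */
    int count;       /* number of bytes read by this I/O */
};

/* Hypothetical callback: fetch `count` bytes at `offset` of a layer.
 * Returns the number of bytes fetched, or a negative value on error. */
typedef int (*layer_read_fn)(int layer_index, int offset, int count,
                             void *ctx);

/* Replay frames in recorded order. Returns the total number of bytes
 * prefetched, or -1 as soon as one read fails. */
long replay_trace(const struct trace_frame *frames, size_t n,
                  layer_read_fn read_layer, void *ctx)
{
    assert(frames != NULL || n == 0);
    assert(read_layer != NULL);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        int got = read_layer(frames[i].layer_index, frames[i].offset,
                             frames[i].count, ctx);
        if (got < 0)
            return -1;
        total += got;
    }
    return total;
}

/* Stub reader for demonstration: pretends every read succeeds in full
 * and counts calls through ctx (an int *). */
int stub_read(int layer_index, int offset, int count, void *ctx)
{
    (void)layer_index;
    (void)offset;
    if (ctx != NULL)
        (*(int *)ctx)++;
    return count;
}
```

Because the frames are replayed in the order they were recorded, data arrives roughly in the order the container will ask for it; a real reader would replace stub_read with something that fetches the range into the local cache.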

The biggest difference between DADI trace prefetch and stargz prioritized-files prefetch is that DADI works on block devices. Recording is therefore finer-grained, and there is no need to pull down an entire file. Because the trace is time-based, replay is also more efficient, saves bandwidth, and avoids bursts.

Finally, since DADI has its own metadata, the trace does not have to be stored together with the actual image data as in stargz; it is placed in the metadata instead.

xxinran commented 2 years ago

Thanks for your reply. I tried setting the label LabelSupportReadWriteMode: "dev" on overlaybd-snapshotter to use the block device directly, and then ran rpull on an image with an acceleration layer, but it failed with:

ctr: failed to commit snapshot extract-199052258-yDl8 sha256:fd84eb22532fcbe372941242bb3ebc762860b26392aaad92fd3b4342a887a66c: failed to commit writable overlaybd: failed to open file '/var/lib/overlaybd/snapshots/524/block/writable_data', 2: No such file or directory

So I wonder whether "dev" mode supports prefetching, or whether I misconfigured something.

beef9999 commented 2 years ago

Looks like I have to withdraw my answer and replace it with 'Currently not supported' ... @BigVan

xxinran commented 2 years ago

I wonder if there is any plan to support this usage, or is there any reason that we couldn't support it?

xxinran commented 2 years ago

closed in #116