[Open] AkihiroSuda opened this issue 2 years ago
I think `nerdctl image -a` or `-o wide` should show the blob size; by default it is not necessary to show it.
Benchmark

Only Unpacked Size:
```
$ ntimes -n 10 _output/nerdctl images
REPOSITORY    TAG       IMAGE ID        CREATED           PLATFORM       SIZE         BLOB SIZE
alpine        latest    21a3deaa0d32    24 hours ago      linux/amd64    5.9 MiB      0.0 B
debian        latest    fb45fd4e25ab    14 minutes ago    linux/amd64    134.5 MiB    0.0 B
ubuntu        latest    669e010b58ba    13 minutes ago    linux/amd64    77.9 MiB     0.0 B
real average: 164.16197ms, max: 185.0302ms, min: 137.0472ms, std dev: 16.171173ms
real 99 percentile: 185.0302ms, 95 percentile: 185.0302ms, 50 percentile: 165.57285ms
user average: 54.0097ms, max: 64.725ms, min: 41.953ms, std dev: 7.904296ms
sys average: 99.1182ms, max: 119.68ms, min: 79.722ms, std dev: 13.591908ms
flaky: 0%
```
Only Blob Size:
```
REPOSITORY    TAG       IMAGE ID        CREATED           PLATFORM       SIZE     BLOB SIZE
alpine        latest    21a3deaa0d32    24 hours ago      linux/amd64    0.0 B    2.7 MiB
debian        latest    fb45fd4e25ab    17 minutes ago    linux/amd64    0.0 B    52.4 MiB
ubuntu        latest    669e010b58ba    16 minutes ago    linux/amd64    0.0 B    27.2 MiB
real average: 157.88058ms, max: 236.3843ms, min: 142.0371ms, std dev: 28.001696ms
real 99 percentile: 236.3843ms, 95 percentile: 236.3843ms, 50 percentile: 149.58565ms
user average: 53.0145ms, max: 79.649ms, min: 43.332ms, std dev: 11.046308ms
sys average: 86.4968ms, max: 100.921ms, min: 72.189ms, std dev: 8.907955ms
flaky: 0%
```
Both:
```
REPOSITORY    TAG       IMAGE ID        CREATED           PLATFORM       SIZE         BLOB SIZE
alpine        latest    21a3deaa0d32    24 hours ago      linux/amd64    5.9 MiB      2.7 MiB
debian        latest    fb45fd4e25ab    20 minutes ago    linux/amd64    134.5 MiB    52.4 MiB
ubuntu        latest    669e010b58ba    19 minutes ago    linux/amd64    77.9 MiB     27.2 MiB
real average: 184.36679ms, max: 217.7352ms, min: 170.1355ms, std dev: 15.003351ms
real 99 percentile: 217.7352ms, 95 percentile: 217.7352ms, 50 percentile: 174.8397ms
user average: 65.1658ms, max: 82.53ms, min: 52.764ms, std dev: 8.821194ms
sys average: 106.5241ms, max: 112.316ms, min: 92.911ms, std dev: 5.815795ms
flaky: 0%
```
The difference between the three averages is negligible, so I think it is useless to add a `--quick` flag.
Thanks, but we need to benchmark with more than 100 images, I guess
@AkihiroSuda I agree. But 100 is a huge number. Do you have a suggestion for getting this number of images in one environment? 😅
I pulled a list of images from some of the most-pulled ones using the Docker Hub API:

```
curl -s "https://hub.docker.com/v2/repositories/library/?page=1&page_size=100" | jq '.results | .[] | .name'
```
Some of them are deprecated (e.g. `scratch`), so in total I have 96 images in my environment.
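To turn that API response into a populated environment, one option is a small helper that emits one pull command per repository. This is only a sketch: the `pullCommands` helper is hypothetical, it assumes only the `results[].name` shape of the Hub response shown above, and it assumes a `latest` tag exists for each repository (skipping `scratch` as noted):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hubPage mirrors the fields we need from the Docker Hub
// /v2/repositories/library/ response used in the curl command above.
type hubPage struct {
	Results []struct {
		Name string `json:"name"`
	} `json:"results"`
}

// pullCommands turns a Hub API payload into one `nerdctl pull` command
// per repository, skipping names with no pullable image (e.g. "scratch").
func pullCommands(payload []byte) ([]string, error) {
	var page hubPage
	if err := json.Unmarshal(payload, &page); err != nil {
		return nil, err
	}
	var cmds []string
	for _, r := range page.Results {
		if r.Name == "scratch" { // deprecated / not pullable
			continue
		}
		cmds = append(cmds, fmt.Sprintf("nerdctl pull %s:latest", r.Name))
	}
	return cmds, nil
}

func main() {
	sample := []byte(`{"results":[{"name":"alpine"},{"name":"scratch"},{"name":"ubuntu"}]}`)
	cmds, err := pullCommands(sample)
	if err != nil {
		panic(err)
	}
	for _, c := range cmds {
		fmt.Println(c)
	}
}
```

Piping the generated commands into a shell (or executing them with `os/exec`) would then populate the environment in one go.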
I repeated @fahedouch's experiment (cc @AkihiroSuda):
Both:
```
REPOSITORY             TAG       IMAGE ID        CREATED        PLATFORM       SIZE         BLOB SIZE
ubuntu-debootstrap     latest    e74053a4261a    8 hours ago    linux/amd64    101.2 MiB    33.3 MiB
......
wordpress              latest    999392cfea3c    8 hours ago    linux/amd64    624.8 MiB    204.7 MiB
real average: 3.26550947s, max: 3.293986785s, min: 3.239568373s, std dev: 17.687189ms
real 99 percentile: 3.293986785s, 95 percentile: 3.293986785s, 50 percentile: 3.26016171s
user average: 2.3691691s, max: 2.431819s, min: 2.318302s, std dev: 38.575007ms
sys average: 1.3688926s, max: 1.444045s, min: 1.293038s, std dev: 49.071786ms
```
Without Blob Size (comment out `image.Size(ctx)`):
```
REPOSITORY    TAG       IMAGE ID        CREATED        PLATFORM       SIZE         BLOB SIZE
java          latest    c1ff613e8ba2    8 hours ago    linux/amd64    662.7 MiB    0.0 B
...
wordpress     latest    999392cfea3c    8 hours ago    linux/amd64    624.8 MiB    0.0 B
real average: 2.986185244s, max: 3.01026741s, min: 2.964963446s, std dev: 12.432606ms
real 99 percentile: 3.01026741s, 95 percentile: 3.01026741s, 50 percentile: 2.983283399s
user average: 2.1341679s, max: 2.221924s, min: 2.033311s, std dev: 56.500939ms
sys average: 1.2875114s, max: 1.377556s, min: 1.228951s, std dev: 44.839664ms
```
Without Unpacked Size (comment out `unpackedImageSize(ctx, x.snapshotter, image)`):
```
REPOSITORY         TAG       IMAGE ID        CREATED           PLATFORM       SIZE     BLOB SIZE
jenkins/jenkins    latest    1f6e7ef75a54    18 minutes ago    linux/amd64    0.0 B    276.3 MiB
...
wordpress          latest    999392cfea3c    8 hours ago       linux/amd64    0.0 B    204.7 MiB
real average: 2.079441195s, max: 2.100727932s, min: 1.989394178s, std dev: 32.790733ms
real 99 percentile: 2.100727932s, 95 percentile: 2.100727932s, 50 percentile: 2.083640085s
user average: 1.5221273s, max: 1.624942s, min: 1.371136s, std dev: 68.459596ms
sys average: 863.0735ms, max: 908.323ms, min: 797.141ms, std dev: 38.225147ms
```
Without both (comment out both):
```
REPOSITORY    TAG       IMAGE ID        CREATED           PLATFORM       SIZE     BLOB SIZE
neo4j         latest    8ba4306cccb0    44 minutes ago    linux/amd64    0.0 B    0.0 B
...
percona       latest    8e77cd4bdbed    8 hours ago       linux/amd64    0.0 B    0.0 B
real average: 1.808876959s, max: 1.837062398s, min: 1.793315846s, std dev: 15.067907ms
real 99 percentile: 1.837062398s, 95 percentile: 1.837062398s, 50 percentile: 1.80513215s
user average: 1.3167565s, max: 1.369457s, min: 1.242427s, std dev: 44.493909ms
sys average: 749.8885ms, max: 855.136ms, min: 661.094ms, std dev: 62.384582ms
```
It seems `unpackedImageSize` costs a lot of time?
About 44% faster if we skip size calculation.
nerdctl version >= 0.22.0
Reviving this issue, and sharing notes here:
I was working on adding caching to the snapshotter for commands like `images` (so that we do not make several `Stat` and `Usage` calls for the same objects), and started looking into image list overall.

Our current implementation of image list first calls `images.List`.
Then, for each image, we call `image.Platforms()`, which:

Then, for each image and each architecture, we call `images.Check`, which:

- calls `ReaderAt` on every "required" resource (e.g. any linked manifest)
- `ReaderAt` does:
Then we call `containerd.NewImageWithPlatform`, then `Config()` on the resulting image, which will read the config blob again.

Then we call `RootFS` for the size calculation, which again will read the config blob (twice).

Then we call `image.Size` for the blob size calculation, which will again read the blob of every child of the target.
That is a lot of duplicate filesystem calls.
A quick auditd test confirms the above and reports about 60 filesystem hits for Alpine here (Alpine contains one layer and 8 architectures). Whether or not this is entirely accurate, or my reading of the code is 100% right, this is still a lot of filesystem accesses for one single image.
Suggestion:
That should bring down the number of filesystem accesses significantly.
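The caching mentioned at the top of this comment could look roughly like a per-key memoizer around the expensive snapshotter calls. This is only a sketch under stated assumptions: `usageFunc` is a hypothetical stand-in signature, not containerd's actual `Stat`/`Usage` API:

```go
package main

import (
	"fmt"
	"sync"
)

// usageFunc stands in for an expensive snapshotter call such as
// Stat or Usage; the real containerd signatures differ.
type usageFunc func(key string) (int64, error)

// cachedUsage memoizes results per key, so repeated lookups for the
// same snapshot (as image list currently does) hit the filesystem once.
func cachedUsage(fn usageFunc) usageFunc {
	type result struct {
		size int64
		err  error
	}
	var mu sync.Mutex
	cache := map[string]result{}
	return func(key string) (int64, error) {
		mu.Lock()
		defer mu.Unlock()
		if r, ok := cache[key]; ok {
			return r.size, r.err
		}
		size, err := fn(key)
		cache[key] = result{size, err}
		return size, err
	}
}

func main() {
	calls := 0
	slow := func(key string) (int64, error) { // pretend filesystem walk
		calls++
		return int64(len(key)), nil
	}
	fast := cachedUsage(slow)
	fast("sha256:abc")
	fast("sha256:abc") // served from cache
	fast("sha256:def")
	fmt.Println("underlying calls:", calls)
}
```

For simplicity the sketch holds the lock across the underlying call; a real implementation would likely want per-key singleflight-style deduplication instead of a global mutex.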
Here are some highly unscientific numbers against main and against my current work in progress:
For 500 identical, single-platform images (tagged from `apostasie/test`), `time` gives:

For 500 identical, multi-platform images (all tagged from `alpine:latest`, just the current platform):

For 500 identical, truly multi-platform images (all tagged from `alpine:latest`, `--all-platforms`):
Using the image list from above, with code instrumentation, `printImage` execution time gets divided by 3 (from about 1 second to 0.3 seconds), although `time` only shows about a 20% improvement overall.
The PR is not ready to be merged and is more of a proof of concept, but I would like to know what you folks think.
Cheers.
On today's main (873a08791f570bafb91dc10f99e2b9bf45c7e6f9), with the list of images provided above (84 images currently), on a lima setup / M1 Max:
```
real average: 408.938725ms, max: 430.797757ms, min: 391.571417ms, std dev: 14.795711ms
real 99 percentile: 430.797757ms, 95 percentile: 430.797757ms, 50 percentile: 398.672759ms
user average: 167.9466ms, max: 201.888ms, min: 133.506ms, std dev: 25.245575ms
sys average: 151.5746ms, max: 173.962ms, min: 110.628ms, std dev: 22.000527ms
flaky: 0%
```
Originally posted by @AkihiroSuda in https://github.com/containerd/nerdctl/issues/789#issuecomment-1038903152