lima-vm / lima

Linux virtual machines, with a focus on running containers
https://lima-vm.io/
Apache License 2.0

Digest-less images always re-download after last-modified time changes on server #2902

Open nirs opened 1 week ago

nirs commented 1 week ago

Description

When downloading a cached image, if the last-modified time from the server differs from the cached time, we download the image again (good), but the cached time is not updated, so we download the image again for every new instance.

Example:

% limactl create --tty=0 test.yaml
...
INFO[0000] Re-downloading digest-less image: last-modified mismatch (cached: "Wed, 09 Oct 2024 00:31:27 GMT", remote: "Thu, 07 Nov 2024 14:31:23 GMT") 
Downloading the image (ubuntu-24.04-server-cloudimg-arm64.img)
571.32 MiB / 571.32 MiB [----------------------------------] 100.00% 60.94 MiB/s
...

% limactl create --tty=0 test.yaml --name test2
...
INFO[0000] Re-downloading digest-less image: last-modified mismatch (cached: "Wed, 09 Oct 2024 00:31:27 GMT", remote: "Thu, 07 Nov 2024 14:31:23 GMT") 
Downloading the image (ubuntu-24.04-server-cloudimg-arm64.img)
571.32 MiB / 571.32 MiB [----------------------------------] 100.00% 50.72 MiB/s

% limactl create --tty=0 test.yaml --name test3
...
INFO[0000] Re-downloading digest-less image: last-modified mismatch (cached: "Wed, 09 Oct 2024 00:31:27 GMT", remote: "Thu, 07 Nov 2024 14:31:23 GMT") 
Downloading the image (ubuntu-24.04-server-cloudimg-arm64.img)
571.32 MiB / 571.32 MiB [----------------------------------] 100.00% 58.23 MiB/s

I think this is caused by the change that fixed concurrent downloads: we store the new time only once. If the file already exists, we don't replace it. We probably need to replace the file when we know that the old time is stale.
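The suspected failure mode can be illustrated with a minimal Go sketch. It assumes the cached last-modified value is kept in a small file that is created only when missing. The names `writeOnce`, `needsDownload`, and `simulate`, and the `time` file name, are hypothetical and are not taken from Lima's code.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

const (
	oldTime = "Wed, 09 Oct 2024 00:31:27 GMT"
	newTime = "Thu, 07 Nov 2024 14:31:23 GMT"
)

// writeOnce (hypothetical) creates the file only if it does not exist yet.
// An existing value, even a stale one, is silently kept.
func writeOnce(path, value string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
	if errors.Is(err, fs.ErrExist) {
		return nil // file already there: keep the old content
	}
	if err != nil {
		return err
	}
	if _, err := f.WriteString(value); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// needsDownload reports whether the cached time differs from the remote time.
func needsDownload(path, remote string) (bool, error) {
	cached, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return true, nil
	}
	if err != nil {
		return false, err
	}
	return string(cached) != remote, nil
}

// simulate starts with a stale cached time, creates n instances, and returns
// how many of them triggered a download when store is used to save the time.
func simulate(store func(path, value string) error, n int) (int, error) {
	dir, err := os.MkdirTemp("", "cache-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "time")
	if err := os.WriteFile(path, []byte(oldTime), 0o644); err != nil {
		return 0, err
	}

	downloads := 0
	for i := 0; i < n; i++ {
		stale, err := needsDownload(path, newTime)
		if err != nil {
			return downloads, err
		}
		if stale {
			downloads++ // the image would be downloaded here
			if err := store(path, newTime); err != nil {
				return downloads, err
			}
		}
	}
	return downloads, nil
}

func main() {
	n, err := simulate(writeOnce, 3)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("downloads:", n) // prints "downloads: 3"
}
```

With three instances the sketch reports three downloads, matching the three re-downloads in the log above: the stale value is never replaced, so every comparison mismatches.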

Can be reproduced with:

images:
  - location: "https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-arm64.img"
    arch: "aarch64"
  - location: "https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img"
    arch: "x86_64"
vmType: vz
plain: true

Workaround

Pruning the cache will fix this until the next last-modified time change on the server:

limactl prune

afbjorklund commented 6 days ago

Previously, any cached image would remain until deleted. We might need an "imagePullPolicy" some day?

images:
# Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
- location: "https://cloud-images.ubuntu.com/releases/24.10/release-20241023/ubuntu-24.10-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:ee070d95a2ba5a1500264e75b3e14aa85518220c24d25f1535407c55f0e33e4d"
# Fallback to the latest release image.
# Hint: run `limactl prune` to invalidate the cache
- location: "https://cloud-images.ubuntu.com/releases/24.10/release/ubuntu-24.10-server-cloudimg-amd64.img"
  arch: "x86_64"

But if it does decide to download a new file, then the "timestamp" (file) should be updated as well...
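One way to update such a timestamp file while staying safe for concurrent readers is to write a temporary file and rename it over the old one. The Go sketch below shows that general technique only; `replaceFile`, `demo`, and the `time` file name are hypothetical, and this is not necessarily how Lima implements the update.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// replaceFile (hypothetical) replaces the content of path by writing a
// temporary file in the same directory and renaming it over the target.
// Readers see either the old or the new content, never a partial write.
func replaceFile(path, value string) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	// Clean up the temporary file on failure; after a successful rename
	// the name no longer exists and the error from Remove is ignored.
	defer os.Remove(tmp.Name())

	if _, err := tmp.WriteString(value); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo seeds a stale cached time, replaces it, and returns what is stored.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "cache-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "time")
	if err := os.WriteFile(path, []byte("Wed, 09 Oct 2024 00:31:27 GMT"), 0o644); err != nil {
		return "", err
	}
	if err := replaceFile(path, "Thu, 07 Nov 2024 14:31:23 GMT"); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("cached time is now:", got)
}
```

Once the stored time matches the remote time, the next instance would find no mismatch and could reuse the cached image.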

nirs commented 6 days ago

But if it does decide to download a new file, then the "timestamp" (file) should be updated as well...

Indeed, fixed in #2903