systemd / casync

Content-Addressable Data Synchronization Tool

Are files always extracted even if not modified in target? #264

Open agambier opened 2 years ago

agambier commented 2 years ago

I don't know if there is a forum for this kind of question.

On an embedded device, I use the command below to update the secondary partition of the rootfs.

casync extract --verbose --seed=/ --store=http://10.0.10.151:8080/rootfs.castr http://10.0.10.151:8080/rootfs.caidx /var/tmp/rootfs-b

According to the output of casync, I can see that after seeding the files, casync extracts them.

...
Extracting usr/share/terminfo/s/screen
Extracted usr/share/terminfo/s/screen
Extracting usr/share/terminfo/s/screen-256color
Extracted usr/share/terminfo/s/screen-256color
Extracted usr/share/terminfo/s
Extracting usr/share/terminfo/v
Extracting usr/share/terminfo/v/vt100
Extracted usr/share/terminfo/v/vt100
Extracting usr/share/terminfo/v/vt100-putty
Extracted usr/share/terminfo/v/vt100-putty
Extracting usr/share/terminfo/v/vt102
Extracted usr/share/terminfo/v/vt102
Extracting usr/share/terminfo/v/vt200
...

Are the files really extracted even if they have not been modified on the rootfs? Or is it just a confusing debug message?

polarathene commented 3 weeks ago

With only a local filesystem, you can run casync make --without=all followed by casync extract, and you'll notice that, regardless of any seed options, files seem to get their timestamp (the mtime attribute) updated even when their content has not changed.

$ docker run --rm -it ubuntu:24.04
$ apt update && apt install -y casync
$ cd /tmp && mkdir -p src && touch src/file

# Avoid storing all extra metadata including mtime
$ casync make example.caidx --without=all src
$ casync extract example.caidx dest

$ ls -la dest
total 0
-rw-r--r-- 1 root root    0 Sep 11 09:00 file

# Wait a minute and try again:
$ casync extract example.caidx dest
$ ls -la dest
total 0
-rw-r--r-- 1 root root    0 Sep 11 09:01 file

The file was clearly modified: its mtime was updated even though there should be no diff between source and dest.


I was curious about using this with a Dockerfile to copy over only the changes (file modifications as small as a chown introduce full copies, not just the delta). This casync behaviour is therefore not ideal if there is no way to opt out of it (there's a similar issue with Rust, which relies on mtime for its cache).

Even with the default attributes, this behaviour also replaces the file itself, which changes the inode associated with it. That should ideally be avoided when there is no actual change to apply.
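
For anyone wanting to confirm both effects (mtime and inode) on their own setup, here is a minimal sketch of a check. It assumes GNU stat from coreutils; snapshot and check_unchanged are hypothetical helper names made up for illustration, not part of casync.

```shell
#!/bin/sh
# Sketch only: snapshot and check_unchanged are hypothetical helper names,
# and GNU stat (coreutils) is assumed for the -c format option.

# Print "<inode> <mtime in epoch seconds>" for a file.
snapshot() {
    stat -c '%i %Y' "$1"
}

# Run a command and report whether the file's inode or mtime changed.
# Usage: check_unchanged <file> <command> [args...]
check_unchanged() {
    cu_file=$1
    shift
    cu_before=$(snapshot "$cu_file")
    "$@" >&2    # keep the command's own output off stdout
    cu_after=$(snapshot "$cu_file")
    if [ "$cu_before" = "$cu_after" ]; then
        echo "unchanged"
    else
        echo "changed: $cu_before -> $cu_after"
    fi
}
```

With the reproduction above, it could be called as check_unchanged dest/file casync extract example.caidx dest. Note that %Y only has one-second resolution, so a re-extract within the same second would be detected through the inode alone.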


Fedora no longer seems to offer the package, as it fails to build there. Looking at the repo history, it also seems that not much development goes on with casync: the README hasn't been touched for years and still marks the verify and digest commands as experimental.

There's this closed issue in which a maintainer stated in 2022, when closing it, that the project is not dead. Even if that were disputed today, one might agree that the project isn't exactly active or healthy?