flokli / nix-casync

A more efficient way to store and substitute Nix store paths

Crazy idea: Use reflinks to construct files from chunks #38

Atemu closed 2 years ago

Atemu commented 2 years ago

I'm not 100% sure this is a good idea but it's a cool idea for sure:

Use copy_file_range() to construct the files in the to-be-substituted store paths out of existing files in the Nix store that share chunks. CoW filesystems such as btrfs can then simply add some pointers in their internal data structures so that the new file references the same data as the old one, which is faster and more space-efficient than copying the chunks.

For example, /nix/store/aa...-foo contains a chunk from 1M-2M that you need in your substituted file at 0M-1M. Since this file is already present, you can use copy_file_range() to clone that range into the new file with the help of the filesystem.
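
A minimal sketch of that example in Python, assuming Linux and Python 3.8+ (where os.copy_file_range() is available). The helper name clone_range and its arguments are made up for illustration; nothing here is part of nix-casync or Nix. Whether the data ends up shared rather than copied depends on the filesystem, and reflinking typically requires block-aligned offsets and lengths, so treat the sharing as a best-effort optimization.

```python
import os


def clone_range(src_path, src_offset, dst_path, dst_offset, length):
    """Copy `length` bytes from src_path at src_offset to dst_path at dst_offset.

    Hypothetical helper: uses copy_file_range() so the filesystem may share
    the underlying data (e.g. reflinks on btrfs), and falls back to a plain
    read/write copy if the call is unavailable or fails.
    Returns the number of bytes actually copied.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        copied = 0
        while copied < length:
            remaining = length - copied
            try:
                n = os.copy_file_range(
                    src_fd, dst_fd, remaining,
                    src_offset + copied, dst_offset + copied,
                )
            except (OSError, AttributeError):
                # Not supported here (old kernel, different filesystems,
                # non-Linux platform): copy the bytes through userspace.
                data = os.pread(src_fd, remaining, src_offset + copied)
                n = os.pwrite(dst_fd, data, dst_offset + copied) if data else 0
            if n == 0:
                break  # source file is shorter than the requested range
            copied += n
        return copied
    finally:
        os.close(src_fd)
        os.close(dst_fd)
```

With 1M chunks, the case above would be roughly clone_range(existing_file, 1 << 20, new_file, 0, 1 << 20), where existing_file is a file under /nix/store/aa...-foo and new_file is the file being substituted.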

This way substituted files would stay deduplicated at the filesystem level.

I'm pretty sure I saw someone working on storing zstd-compressed files as transparently decompressible files directly in btrfs, but I'm not sure that has landed yet. Making use of that would be the efficiency cherry on top.

flokli commented 2 years ago

This is a nice thought. But nix-casync doesn't keep an inventory of the local Nix store.

I didn't really plan to have it access /nix/store directly, either.

I thought about using it to provide /nix/store itself as a FUSE filesystem, for when you want to run Nix-built stuff on a system that doesn't necessarily need to run Nix evaluations and builds itself. But I don't really feel like it should poke inside files owned by other processes (Nix itself).

This could be an optimization if that type of substitution ends up in Nix itself, and if we had all the data there, but I don't think it belongs in nix-casync itself.