Open Ten0 opened 6 months ago
After some testing, since lib.fileset
does remove empty directories, the following works:
src = lib.fileset.toSource {
root = src;
fileset = lib.fileset.fromSource (lib.cleanSourceWith {
inherit src;
filter = path: type: (
let baseName = baseNameOf (toString path);
in
(
(type == "directory" && baseName != "target" && baseName != "node_modules")
|| (baseName == "Cargo.toml" || baseName == "Cargo.lock" || baseName == "lib.rs" || baseName == "main.rs")
) && (lib.cleanSourceFilter path type) # + other basic filters
);
});
};
It's still not completely ideal because we actually don't care about the contents of the lib.rs
and main.rs
files, we only care about them being there or not. (They are used by cargo metadata to determine crates automatically when there are no [[lib]]
or [[bin]]
targets).
Another point to solve may be how bench/tests targets are inferred, so we may need to copy a bit more of the tree here as well. (Maybe we want to copy any .rs
file that's in a tests
folder but that on the other hand would be too much...)
Potential solution: two additional parameters, hasLibRs
and hasMainRs
. Then, pass in only the Cargo.lock
.
The
src
directory for IFD is not filtered: https://github.com/nix-community/crate2nix/blob/cf034861fdc4e091fc7c5f01d6c022dc46686cf1/tools.nix#L50This asks to copy the entire workspace tree to the nix store, performing no filtering. On large repositories containing build artifacts, this is impractical: it may sometimes represent hundreds of gigabytes, and causes the build to freeze forever while it's trying to copy those hundreds of gigabytes to the store, when all one really needs is all the
Cargo.toml
s andCargo.lock
.Fixing
EDIT: see updated solution below instead
It looks like this may be fixed by using
cleanSourceWith
:Although this still has the annoying property that the derivation hash will depend on a bunch of unnecessary empty directories (the hash shouldn't depend on any empty directory either), it makes it run reasonably, so it would still be a clear improvement.
Alternatively
lib.fileset
may be an option.Also cleaning up empty directories
This post seems to suggest that
lib.fileset
may by default already be filtering empty directories.Alternatively
According to this post, adding another layer to further recursively clean unnecessary empty directories:
may allow preserving the source hash regardless of whether new empty directories are created, but I'm not sure what prevents this from being source addressed so maybe I misunderstand something. Maybe that + another layer of IFD re-copying the cleaned source would fix it but that seems incredibly dirty.
The post also presents another option that does not have this issue, but which is obviously bad "because nested directories at depth N are redundantly processed N - 1 times.".
It looks like doing the regular directory traversal that filterSource does, but only creating a directory once we're about to put an actual file inside, is something that should already exist in Nix though... I can't be the first looking for this, so it must be in
lib.fileset
...