Open rski opened 1 year ago
1a29857b8a93f5259f0c2e919becc0bf9db24f85 is not it
current bisect status:
git bisect start
# status: waiting for both good and bad commits
# bad: [09720cc41f0dad446f119e3a6259c640d4b33003] Merge #235556: staging-next 2023-06-02
git bisect bad 09720cc41f0dad446f119e3a6259c640d4b33003
# status: waiting for good commit(s), bad commit known
# good: [a6c64b2c29b11b3a9206918a46a37a1c53cdf1a0] Merge pull request #230373 from gregod/photoprism-230506-9de9a3540
git bisect good a6c64b2c29b11b3a9206918a46a37a1c53cdf1a0
# bad: [9c289b427e36f1f317673e5456067de45f8bf2fe] Merge pull request #234994 from layus/autopatchelf-single-files
git bisect bad 9c289b427e36f1f317673e5456067de45f8bf2fe
# good: [9441fc25d1b6af4d2323549221e5eb17bb26f6bd] Merge staging-next into staging
git bisect good 9441fc25d1b6af4d2323549221e5eb17bb26f6bd
I'm very certain it's due to the coreutils upgrade. There are even mentions of relevant things in the changelog: https://lists.gnu.org/archive/html/coreutils-announce/2023-04/msg00000.html
cp --reflink=auto (the default), mv, and install
will again fall back to a standard copy in more cases.
Previously copies could fail with permission errors on
more restricted systems like android or containers etc.
[bug introduced in coreutils-9.2]
cp --recursive --backup will again operate correctly.
Previously it may have issued "File exists" errors when
it failed to appropriately rename files being replaced.
[bug introduced in coreutils-9.2]
cc @dasJ
semi-related, given the changelog, these might also need fixing:
rski@rski ~/C/n/nixpkgs ((e959b488))> rg "no-clobber"
pkgs/build-support/dotnet/make-nuget-source/default.nix
21: cp --no-clobber '{}' $out/lib ';'
pkgs/build-support/setup-hooks/move-lib64.sh
17: mv --no-clobber "$i" $prefix/lib
pkgs/build-support/docker/default.nix
780: cp -R --no-clobber inputs/*/* image/
nixos/modules/services/networking/znc/default.nix
295: cp --no-preserve=ownership --no-clobber ${cfg.configFile} ${cfg.dataDir}/configs/znc.conf
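Before touching those call sites, it may help to check how the installed cp actually treats a skipped file, since the exit status on a skip may differ between coreutils releases. A minimal sketch, assuming GNU cp; the function name check_no_clobber is made up for illustration:

```shell
#!/bin/sh
# check_no_clobber: copy onto an existing file with --no-clobber and report
# what the installed cp did. Hypothetical helper, not part of nixpkgs.
check_no_clobber() {
  work=$(mktemp -d) || return 1
  echo new > "$work/src"
  echo old > "$work/dst"

  # The destination already exists, so cp should skip the copy.
  cp --no-clobber "$work/src" "$work/dst" 2>/dev/null
  status=$?

  echo "exit status: $status"
  echo "dst content: $(cat "$work/dst")"
  rm -rf "$work"
}

check_no_clobber
```

If the reported exit status is nonzero, any of the scripts above that run under set -e would presumably abort when a file is skipped.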
tried adding http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blobdiff_plain;f=src/cp.c;h=00a5cb813711826102e8d3c7d41cf99b4b1b656f;hp=488770a0b6e963e3c876b44e0b5bb2bee0690941;hb=c6b1fe43474b48a6bf5793e11cc1d0d6e895fdf4;hpb=7223651ad194a5868b58c1be6c7452fd3ca2f75a to the coreutils patches, doesn't seem to work
i'm out of ideas for now, maybe the coreutils update needs to be reverted?
https://github.com/NixOS/nixpkgs/issues/244331: coreutils 9.4 will maybe fix the issue, but the related bug threads are hard to follow.
Same here, trying to switch to 23.11. Can't build anything on company desktops. Thanks for the investigation work.
I tried using an overlay with an older version of coreutils, but then no binary cache (and compiling everything from scratch runs into other issues I haven't investigated).
Did you manage to find a workaround? Or even which patch of the bug threads actually fixes the issue (can't stay on 9.1 forever)?
I have the same problem on OpenShift, using this Nix image https://hub.docker.com/layers/nixos/nix/2.24.7/images/sha256-2bf4f7ad8306dc40fda7a1f8f40717fbdbb606b2425bd24c4d52cf4214588657?context=explore. @rski @GeoffreyFrogeye did you find a solution for this?
I worked around the issue by doing the cardinal sin of modifying the nix store directly so the version of coreutils used is actually an older one. This needs to be re-applied on every coreutils upgrade.
Which coreutils version are you using? I thought it would have been fixed in 9.4. I can't really test anymore since my nix-on-NFS use case disappeared.
I tried with 9.4 and the problem persists. I ended up doing a hacky thing where I build everything on persistent storage and then copy it over to NFS.
Using 9.5 on my end.
one possible solution is[^1]:
https://github.com/rski/nixpkgs/commit/daabbf44c5ab54371873ca673014c86bf3fb86ca, making defaultUnpack do:
cp -r --reflink=auto -- "$fn" "$destination"
instead of cp -rp, perhaps even adding
--preserve=mode,timestamps
I'm not sure what would break if I changed these flags though, and I'd rather not be responsible for melting down the entire nix ecosystem
[^1] I haven't tested it because building on nfs is horribly slow, but I think it should work.
If it works it would be really nice!
It seems like mode is the issue, not ownership. Here is a more minimal repro, taken from https://github.com/samdroid-apps/nix-articles/blob/master/04-proper-mkderivation.md, that doesn't depend on buildGoModule:
{
  inputs = {
    # has the fix, disabling mode on cp
    nixpkgs.url = "github:rski/nixpkgs/fdf6a2c96f5865215b185cdd81bcc94dde9c7778";
    nixpkgs2.url = "github:rski/nixpkgs/bdac0fa35d69d2ea454deeb71b0c826aef53886c";
  };
  outputs =
    { nixpkgs, nixpkgs2, ... }:
    let
      pkgs = import nixpkgs { system = "x86_64-linux"; };
      pkgs2 = import nixpkgs2 { system = "x86_64-linux"; };
    in
    {
      packages = {
        mytest = pkgs.stdenv.mkDerivation {
          name = "example-website-content";
          src = pkgs.fetchFromGitHub {
            owner = "jekyll";
            repo = "example";
            rev = "5eb1b902ca3bda6f4b50d4cfcdc7bc0097bac4b7";
            sha256 = "1jw35hmgx2gsaj2ad5f9d9ks4yh601wsxwnb17pmb9j02hl3vgdm";
          };
          installPhase = ''
            # Build the site to the $out directory
            export JEKYLL_ENV=production
          '';
        };
      };
    };
}
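To narrow down which preserved attribute a given filesystem rejects without going through a Nix build at all, something like the following could be run against a directory on the NFS mount. This is only a sketch, assuming GNU cp; probe_preserve is a hypothetical helper and the destination directory is whatever path you want to test:

```shell
#!/bin/sh
# probe_preserve DEST_DIR: copy a tiny tree into DEST_DIR once per
# --preserve attribute and print whether cp succeeded for each one.
# Hypothetical helper for diagnosis; it is not what stdenv runs.
probe_preserve() {
  dest_root=$1
  src=$(mktemp -d) || return 1
  mkdir -p "$src/tree/sub"
  echo hello > "$src/tree/sub/file"
  chmod 0444 "$src/tree/sub/file"   # read-only file, similar to store sources

  for attr in mode ownership timestamps; do
    target="$dest_root/probe-$attr"
    rm -rf "$target"
    if cp -r --preserve="$attr" -- "$src/tree" "$target" 2>/dev/null; then
      echo "$attr: ok"
    else
      echo "$attr: FAILED"
    fi
    rm -rf "$target"
  done
  rm -rf "$src"
}
```

Calling probe_preserve with a directory on the affected mount and again with one on a local disk should show whether only mode fails, as suspected above.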
(and I'm feeling fine).
I bisected across 16k commits of nixpkgs on a nix store on an nfs drive, where evaluating the sample flake I had at hand took 3+ minutes. Please clap.
This is the flake I used as an example:
and with the command
rm -f flake.lock && nix build .#packages.gopls
(there's probably a better way to update flake.lock to reflect git checkouts in the local nixpkgs repo, but see the part about every nix eval taking 3+ minutes)
At commit 09720cc41f0, the build fails; at git checkout 09720cc41f0~1 (commit a6c64b2c29b), things work again.
I'm guessing the problem is https://github.com/NixOS/nixpkgs/pull/235556/commits/1a29857b8a93f5259f0c2e919becc0bf9db24f85, but I have not tried it yet.
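For anyone repeating this, the manual loop could probably be handed to git bisect run. A rough sketch: bisect_step and the paths in the trailing comments are hypothetical, and the flake is assumed to live in its own directory pointing at the local nixpkgs checkout:

```shell
#!/bin/sh
# bisect_step FLAKE_DIR CMD...: drop the stale flake.lock, run the build
# command inside FLAKE_DIR, and map the result onto the exit codes that
# git bisect run expects (0 = good, 1 = bad).
bisect_step() {
  flake_dir=$1
  shift
  if (cd "$flake_dir" && rm -f flake.lock && "$@"); then
    return 0
  fi
  return 1
}

# Possible usage from the nixpkgs checkout (paths are placeholders):
#   git bisect start
#   git bisect bad 09720cc41f0dad446f119e3a6259c640d4b33003
#   git bisect good a6c64b2c29b11b3a9206918a46a37a1c53cdf1a0
#   git bisect run sh -c '. /path/to/bisect-step.sh && bisect_step /path/to/flake nix build .#packages.gopls'
```

Every build failure is reported as "bad" here, so commits that fail for unrelated reasons would still need to be skipped by hand.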
Steps To Reproduce
Steps to reproduce the behavior:
Output logs of the failure (nix log /nix/store/bvx11f2mbwhn86czn220520qwc3i52ps-gopls-unstable.drv)
At some point, I was also seeing
but not at the point of bisection I guess.
I have the exact same config on a NixOS laptop and an ubuntu laptop, where the store is on a not terrible partition, and it works fine there.
Some other things I tried: adding use-sqlite-wal = false to my nix.conf before recreating /nix.
I'm a bit focused on the NFS part, because I had trouble with nix on NFS in the past as well, and to me that's the big differentiator here.
Metadata
Please run
nix-shell -p nix-info --run "nix-info -m"
and paste the result.
I also had nix 2.11 installed, and that had the same problems.