NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Nix cached truncated files downloaded #4533

Open oxalica opened 3 years ago

oxalica commented 3 years ago

Describe the bug

When a network error or proxy server error occurs, a download can end too early and yield a truncated file. Nix still caches the truncated file, so every subsequent run fails immediately with truncated gzip input, without refetching.

This issue happens because:

  1. The response for a GitHub tarball URL does NOT contain Content-Length, so curl cannot validate the output length.
  2. Some network errors, such as the proxy server being killed, do not RESET the connection. curl then returns zero but produces a truncated file.
  3. Nix caches the output after curl returns zero but before extracting it, which results in cached truncated files.
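The first two points can be illustrated with a small sketch. This is not curl's or Nix's actual logic; the function name is_complete and the byte counts are hypothetical, and it only shows why a missing Content-Length leaves the receiver unable to tell a cleanly closed connection from a finished download.

```python
from typing import Optional


def is_complete(received: int, content_length: Optional[int]) -> Optional[bool]:
    """Return True/False when the length can be checked, None when it cannot."""
    if content_length is None:
        # No Content-Length header: a connection that is closed cleanly
        # (not reset) looks the same as the normal end of the body.
        return None
    # With a declared length, a short body is detectable.
    return received == content_length


if __name__ == "__main__":
    print(is_complete(1024, 4096))  # declared length, short body
    print(is_complete(1024, None))  # no declared length, nothing to compare
```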

Steps To Reproduce

  1. Set up a proxy server.
  2. Run nix-build -A osu-lazer https://github.com/r-ryantm/nixpkgs/archive/cc77f910fc1a06b7cb7eb43639c5904540483c70.tar.gz or use another GitHub archive URL.
  3. During the download, kill the proxy server. Nix will fail with truncated gzip input.
  4. Run the command again. It immediately fails with truncated gzip input again, without any network access.

Expected behavior

Nix caches the downloaded file only if it can be unpacked successfully, so that it can re-download the file and rebuild the next time.

nix-env --version output

nix-env (Nix) 2.4pre20201205_a5d85d0

edolstra commented 3 years ago

This sounds like a bug in the proxy server. There is not much we can do if it's serving corrupted files...

oxalica commented 3 years ago

@edolstra

> This sounds like a bug in the proxy server.

kill -9 produces the same result. I think it's the kernel cleanup behavior (close instead of reset), which is not simple to control.

On the Nix side, extracting (or otherwise verifying) the file before caching it is enough to fix this issue.
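A minimal sketch of the "verify before caching" idea, under stated assumptions: the cache is a plain dict, and gzip_is_intact and cache_if_valid are hypothetical helper names. It is not how Nix implements its fetcher cache; it only shows a download being stored after it decompresses to the end.

```python
import gzip
import io
import zlib


def gzip_is_intact(data: bytes) -> bool:
    """Decompress the whole stream; a truncated stream raises before the end."""
    if not data:
        return False  # an empty body is not a valid gzip stream
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as fh:
            while fh.read(1 << 16):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        return False


def cache_if_valid(cache: dict, url: str, data: bytes) -> bool:
    """Store the download only when it passes the integrity check."""
    if not gzip_is_intact(data):
        return False
    cache[url] = data
    return True
```

With this ordering a truncated download is never recorded, so the next run would fetch the file again instead of failing on the cached copy.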

oxalica commented 3 years ago

Fixed on the curl side: https://github.com/curl/curl/commit/d1f40078c13e85c56332dcb7f908fe2a7b65eb22 We can just wait for the next curl release.

rembo10 commented 2 years ago

Hm, I think I'm experiencing this. I don't have a great internet connection at the moment, and I think the download was interrupted. Not using a proxy server, just nixpkgs.url = "nixpkgs/nixos-21.11" (my only input)

Now if I try to run it again I get the same error as above:

error: failed to extract archive (truncated gzip input)

Using nix 2.4

corwin-of-amber commented 2 years ago

This kept happening and I was unable to figure out which file was truncated. The error does not specify the location and I was unable to turn on more verbosity (tried -L following this https://github.com/NixOS/nix/issues/1904, had no effect).

I ended up having to reinstall Nix. This is a very bad user experience. Please reopen this issue.

charlesbaynham commented 2 years ago

I ran into this too (no proxy server involved) - I tried a nix-store --verify --repair --check-contents to repair my store but this didn't work for some reason.

In the end, I had to do nix-store --gc and clear out my cache completely.

dzmitry-lahoda commented 2 years ago

vscode ➜ /workspaces/composable (dz/byog-container) $ nix run github:ComposableFi/Composable/49473d1e4a86abe62abfad5648532dab3cef15ec#devnet-xcvm-up -L --show-trace --
error: failed to extract archive (truncated gzip input)

       … while fetching the input 'github:ComposableFi/Composable/49473d1e4a86abe62abfad5648532dab3cef15ec'

Using Cachix.

Half an hour ago it worked on another machine. It is a new cache and we have a 1TB plan.

dzmitry-lahoda commented 2 years ago

nix-store --verify --repair --check-contents did not help.

domenkozar commented 2 years ago

@dzmitry-lahoda are you able to nix-store --delete the offending store path?

dzmitry-lahoda commented 2 years ago

Nice idea, will try next time. I did a gc of the store, so I cannot retest now. But I guess deleting the specific package will work.

omnibs commented 1 year ago

Can confirm nix-store --delete [nix-store-path] works, but it's not terribly easy to figure out what that path is.

It happened to me on the initial fetch of nixpkgs, and through a lot of trial and error I figured out that would be in a /nix/store/*nixpkgs-src* path. Some of those paths were dirs, some were files. I guessed the dirs were successfully downloaded and unpacked ones, so I ignored those. Running gzip -t on the remaining paths helped me spot my truncated gzip.
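The manual search described above could be automated roughly as follows. This is a sketch, not an official Nix tool: find_failing_gzips is a hypothetical name, and the check only approximates what gzip -t does by decompressing each regular file to the end.

```python
import glob
import gzip
import os
import zlib
from typing import List


def find_failing_gzips(pattern: str) -> List[str]:
    """Return regular files matching the pattern that fail to decompress."""
    failing = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue  # directories are assumed to be unpacked sources
        if os.path.getsize(path) == 0:
            failing.append(path)  # an empty file is not a valid gzip stream
            continue
        try:
            with gzip.open(path, "rb") as fh:
                while fh.read(1 << 20):
                    pass
        except (EOFError, OSError, zlib.error):
            # Truncated or otherwise unreadable gzip data.
            failing.append(path)
    return failing
```

Calling find_failing_gzips("/nix/store/*nixpkgs-src*") would list candidates to pass to nix-store --delete. Files that are not gzip at all are reported too, so the result is worth checking by hand.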

jualvarez commented 1 year ago

@omnibs Disclaimer: Newbie on Nix here!

I had the same issue and found that running (in my case) nix develop with the --debug option would output the exact path that I needed to delete.

ignoring disappeared cache entry '{"rev":"0a023762fc097047c0a16fa4d2bc3ef6012f4f44","type":"git-tarball"}'
ignoring disappeared cache entry '{"name":"source","type":"tarball","url":"https://api.github.com/repos/NixOS/nixpkgs/tarball/0a023762fc097047c0a16fa4d2bc3ef6012f4f44"}'
using cache entry '{"name":"source","type":"file","url":"https://api.github.com/repos/NixOS/nixpkgs/tarball/0a023762fc097047c0a16fa4d2bc3ef6012f4f44"}' -> '{"etag":"\"01fb19345a43fc5eec91d21637476fecd2906297c4c917c27ae83bd43c1607a4\"","url":"https://codeload.github.com/NixOS/nixpkgs/legacy.tar.gz/0a023762fc097047c0a16fa4d2bc3ef6012f4f44"}', '/nix/store/s1vb9z51ynxvi9c7q8398wsrv6yrj9vk-source'
error: failed to extract archive (truncated gzip input)

Then running

nix-store --delete /nix/store/s1vb9z51ynxvi9c7q8398wsrv6yrj9vk-source

Worked fine. But your post pointed me in the right direction. Thanks!
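To avoid reading the debug output by eye, the store path can be pulled out of the "using cache entry" line with a small script. This is a sketch: the regular expression is inferred only from the log line quoted above, the format may differ between Nix versions, and cached_store_paths is a hypothetical helper name.

```python
import re
from typing import List

# Pattern inferred from the quoted debug line; an assumption, not a
# documented Nix log format.
_CACHE_ENTRY = re.compile(r"^using cache entry .*'(/nix/store/[^']+)'$")


def cached_store_paths(debug_output: str) -> List[str]:
    """Collect store paths named in 'using cache entry' debug lines."""
    paths: List[str] = []
    for line in debug_output.splitlines():
        match = _CACHE_ENTRY.match(line.strip())
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths
```

The debug text would first be captured to a file or string, for example by redirecting the stderr of nix develop --debug, and each returned path can then be given to nix-store --delete.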

dzmitry-lahoda commented 1 year ago

Same here, provider is github:

    osmosis-src.flake = false;
    osmosis-src.url = github:osmosis-labs/osmosis/v16.1.1;

My colleague and I suddenly started getting these.

dz@pop-os:~/github.com/informalsystems/cosmos.nix$ nix --version
nix (Nix) 2.16.0
dz@pop-os:~/github.com/informalsystems/cosmos.nix$ nix show-config
accept-flake-config = false
access-tokens = 
allow-dirty = true
allow-import-from-derivation = true
allow-new-privileges = false
allow-symlinked-store = false
allow-unsafe-native-code-during-evaluation = false
allowed-impure-host-deps = 
allowed-uris = 
allowed-users = *
auto-allocate-uids = false
auto-optimise-store = false
bash-prompt = 
bash-prompt-prefix = 
bash-prompt-suffix = 
build-hook = /nix/store/y49q7bwh5n5ybz8skxhpdypai032dsml-nix-2.16.0/bin/nix __build-remote
build-poll-interval = 5
build-users-group = nixbld
builders = @/etc/nix/machines
builders-use-substitutes = false
commit-lockfile-summary = 
compress-build-log = true
connect-timeout = 0
cores = 20
diff-hook = 
download-attempts = 5
download-speed = 0
eval-cache = true
experimental-features = flakes nix-command
extra-platforms = i686-linux x86_64-v1-linux x86_64-v2-linux x86_64-v3-linux
fallback = false
filter-syscalls = true
flake-registry = https://channels.nixos.org/flake-registry.json
fsync-metadata = true
gc-reserved-space = 8388608
hashed-mirrors = 
http-connections = 25
http2 = true
id-count = 8388608
ignore-try = false
ignored-acls = security.csm security.selinux system.nfs4_acl
impersonate-linux-26 = false
keep-build-log = true
keep-derivations = true
keep-env-derivations = false
keep-failed = false
keep-going = false
keep-outputs = false
log-lines = 10
max-build-log-size = 0
max-free = 18446744073709551615
max-jobs = 1
max-silent-time = 0
max-substitution-jobs = 16
min-free = 0
min-free-check-interval = 5
nar-buffer-size = 33554432
narinfo-cache-negative-ttl = 3600
narinfo-cache-positive-ttl = 2592000
netrc-file = /etc/nix/netrc
nix-path = /home/dz/.nix-defexpr/channels nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixpkgs /nix/var/nix/profiles/per-user/root/channels
plugin-files = 
post-build-hook = 
pre-build-hook = 
preallocate-contents = false
print-missing = true
pure-eval = true
require-sigs = true
restrict-eval = false
run-diff-hook = false
sandbox = relaxed
sandbox-build-dir = /build
sandbox-dev-shm-size = 50%
sandbox-fallback = true
sandbox-paths = /bin/sh=/nix/store/7b943a2k4amjmam6dnwnxnj8qbba9lbq-busybox-static-x86_64-unknown-linux-musl-1.35.0/bin/busybox
secret-key-files = 
show-trace = false
ssl-cert-file = /etc/ssl/certs/ca-certificates.crt
stalled-download-timeout = 300
start-id = 872415232
store = auto
substitute = true
substituters = https://cache.nixos.org/
sync-before-registering = false
system = x86_64-linux
system-features = benchmark big-parallel kvm nixos-test uid-range
tarball-ttl = 3600
timeout = 0
trace-function-calls = false
trace-verbose = false
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs= composable-community.cachix.org-1:GG4xJNpXJ+J97I8EyJ4qI5tRTAJ4i7h+NK2Z32I8sK8= helix.cachix.org-1:ejp9KQpR1FBI2onstMQ34yogDm4OgU2ru6lIwPvuCVs= mitchellh-nixos-config.cachix.org-1:bjEbXJyLrL1HZZHBbO4QALnI5faYZppzkU4D2s0G8RQ=
trusted-substituters = https://cache.nixos.org/ https://composable-community.cachix.org/ https://devenv.cachix.org/ https://nix-community.cachix.org/
trusted-users = dz root dzmitry-lahoda
use-case-hack = false
use-cgroups = false
use-registries = true
use-sqlite-wal = true
use-xdg-base-directories = false
user-agent-suffix = 
warn-dirty = true

dz@pop-os:~/github.com/informalsystems/cosmos.nix$ 

dz@pop-os:~/github.com/informalsystems/cosmos.nix$ uname -a
Linux pop-os 5.19.0-76051900-generic #202207312230~1663791054~22.04~28340d4 SMP PREEMPT_DYNAMIC Wed S x86_64 x86_64 x86_64 GNU/Linux

pwaller commented 1 year ago

Shouldn't there be a nar hash verified before corrupted files make it as far downstream as being extracted?

dzmitry-lahoda commented 1 year ago

I reproduced this. It actually happens when the download just stops (not only via Nix; it is a GitHub problem for some items).

But when it happens, I get this

image

I run nix store delete with the path from -L --debug. Then it starts again and fails.

So I think it is a Nix issue, because Nix must not store partially downloaded files in the store. This violates the store's integrity; for a static derivation with well-known hashes, that seems unacceptable.

I am on nix 2.17.

When GitHub started working OK again, I cleaned the store, and a plain curl download succeeded. But Nix was still getting stuck, as if it were caching the bad internet connection. I restarted the nix daemon.

Nix consistently gets stuck on

f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz"}'
downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz'...
starting download of https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz
[3.3/0.0 MiB DL] downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar

same value all the time

but after a minute I got this

did not find cache entry for '{"name":"source","type":"tarball","url":"https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz"}'
performing daemon worker op: 11
performing daemon worker op: 1
ignoring disappeared cache entry '{"name":"source","type":"file","url":"https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz"}'
downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz'...
starting download of https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.tar.gz
[64.7/84.6 MiB DL] downloading 'https://github.com/osmosis-labs/osmosis/archive/1c5f25d04f19d6302e0bdd585ba1d7a2cc96e397.t

Fine. It feels like some Nix parameters and its usage of curl lead to the issue, but Nix should not cache bad files anyway.

dzmitry-lahoda commented 1 year ago

@edolstra

> This sounds like a bug in the proxy server. There is not much we can do if it's serving corrupted files...

This is not the case. The issue is with how Nix handles downloads.

CorbanR commented 1 year ago

I am running into the exact same issue

 - system: `"aarch64-darwin"`
 - host os: `Darwin 22.6.0, macOS 13.5.1`
 - multi-user?: `yes`
 - sandbox: `no`
 - version: `nix-env (Nix) 2.17.0`
 - channels(root): `"nixpkgs"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixpkgs`

I see it when trying to run

nix search nixpkgs#rclone
error:
       … while fetching the input 'github:NixOS/nixpkgs/nixpkgs-unstable'

       error: cannot get archive member name: truncated gzip input

I have an alias that runs sudo nix-collect-garbage && nix-collect-garbage && sudo nix-store --verify --check-contents --repair && sudo nix-store --optimise which seems to fix the issue. Although that is usually my last resort command when I'm running into issues.

flemzord commented 1 year ago

I have the same problem in my CI: https://github.com/formancehq/stack/actions/runs/6077877454/job/16488949571

domenkozar commented 1 year ago

This is an issue with github that's happening since yesterday.

cor commented 1 year ago

> This is an issue with github that's happening since yesterday.

Same for us

PlumpMath commented 1 year ago

I wonder when it will be over?;;; I've been overhauling the entire flake.nix config settings because it hasn't been updated for several days. I mainly removed all the repositories that took a long time during the nix flake update. Lol. It works well on the MacBook I have, but it doesn't install on the Cylinder Mac Pro... NixOS works well on other PCs. Hmm... I'm switching all darwin systems to determinate systems with nix. Anyway, it's only being installed and updated properly on a MacBook intel-mac. I'm not sure what the difference is, but there's no doubt that the GitHub server has gone haywire. So, I got separate access-tokens for each computer. The first one I received was for the MacBook.

jimmidyson commented 1 year ago

Latest from https://status.github.com:

We have mitigated the impact on download and raw file operations and are seeing recovery on response times but are continuing to monitor.

:crossed_fingers:

PlumpMath commented 1 year ago

For the Mac Pro 2013 (which can only be installed up to Monterey), I wondered why I was getting that error. Once I removed commercial-emacs, the build started without any issues. It's uncertain whether the nix community will offer the option to select commercial-emacs, but in any case, others might find this information useful. Naturally, there were no issues on other MacBooks with the latest Ventura installed. It seems there might be an issue when using repository sources not managed by nixpkgs, considering both the Mac version and the fact that there was an error on nixos as well.

duijf commented 1 year ago

If you used builtins.fetchTarball and you know which archive is broken, this is how you can fix this on a machine:

$ cat fetch.nix
builtins.fetchTarball {
  sha256 = "sha256-6GQ9ib4dA/r1leC5VUpsBo0BmDvNxLjKrX1iyL+h8mc=";
  url = "https://github.com/NixOS/nixpkgs/archive/e43e2448161c0a2c4928abec4e16eae1516571bc.tar.gz";
}

# Repro: if you get this error, you know you have the right archive:
$ nix-instantiate fetch.nix
error: cannot get archive member name: truncated gzip input

# Force nix to download the file again:
$ sudo nix-instantiate --option tarball-ttl 0 fetch.nix

duijf commented 1 year ago

This looks very similar to behaviour you see in cache-poisoning related issues.

Regardless of how the file was originally downloaded or how the upstream server behaved, the local Nix CLI / daemon should not cache invalid archives. In our particular case, we observed that Nix was caching these invalid archives for 1.5 days.

tylerd-canva commented 1 year ago

My understanding is part of the issue here is that fetchTarball only verifies the checksum after it is extracted. In this case the content is incomplete -- an invalid archive -- and therefore fails to extract. The checksum is never validated. But either way, it's weird to cache an artifact that never makes it to the checksum step.

ditsuke commented 11 months ago

Ran into this today, and indeed I would expect a truncated cached asset to be invalidated or not be stored in the cache at all until it's validated.