ocaml / opam

opam is a source-based package manager. It supports multiple simultaneous compiler installations, flexible package constraints, and a Git-friendly development workflow.
https://opam.ocaml.org

opam update, Could not update repository "default": Failed to extract archive #5484

Open jteg68 opened 1 year ago

jteg68 commented 1 year ago

Hi, an 'opam update' gives an error 2. Any ideas on how to investigate this?

teg@hilbert:~$ opam update

<><> Updating package repositories ><><><><><><><><><><><><><><><><><><><><><><>
[ERROR] Could not update repository "default": Failed to extract archive /tmp/opam-17061-2b9851/index.tar.gz:
        "/usr/bin/tar xfz /tmp/opam-17061-2b9851/index.tar.gz -C /tmp/opam-17061-edec5a/default.new" exited with
        code 2

teg@hilbert:~$ opam update --debug-level=10
00:00.036  FILE(config)           Read ~/.opam/config in 0.000s
00:00.036  CLI                    Parsing CLI version 2.1
00:00.036  GSTATE                 LOAD-GLOBAL-STATE @ /home/teg/.opam
00:00.037  SYSTEM                 LOCK /home/teg/.opam/lock (none => read)
00:00.037  SYSTEM                 LOCK /home/teg/.opam/config.lock (none => write)
00:00.037  FILE(config)           Read ~/.opam/config in 0.000s
00:00.037  CLIENT                 UPDATE 
00:00.037  RSTATE                 LOAD-REPOSITORY-STATE @ /home/teg/.opam
00:00.037  FILE(repos-config)     Read ~/.opam/repo/repos-config in 0.000s
00:00.038  SYSTEM                 LOCK /home/teg/.opam/repo/state-11ABC690.cache (none => read)
00:00.164  CACHE(repository)      Loaded /home/teg/.opam/repo/state-11ABC690.cache in 0.126s
00:00.164  SYSTEM                 LOCK /home/teg/.opam/repo/state-11ABC690.cache (read => none)
00:00.164  RSTATE                 Cache found
00:00.164  STATE                  LOAD-SWITCH-STATE @ 5.0.0
00:00.164  FILE(switch-config)    Read ~/.opam/5.0.0/.opam-switch/switch-config in 0.000s
00:00.164  FILE(switch-state)     Read ~/.opam/5.0.0/.opam-switch/switch-state in 0.000s
00:00.164  SYSTEM                 LOCK /home/teg/.opam/5.0.0/.opam-switch/packages/cache (none => read)
00:00.164  CACHE(installed)       Loaded /home/teg/.opam/5.0.0/.opam-switch/packages/cache in 0.000s
00:00.164  SYSTEM                 LOCK /home/teg/.opam/5.0.0/.opam-switch/packages/cache (read => none)
00:00.699  FILE(.config)          Read ~/.opam/5.0.0/.opam-switch/config/ocaml.config in 0.000s
00:00.699  STATE                  Switch state loaded in 0.535s
00:00.699  SYSTEM                 LOCK /home/teg/.opam/repo/lock (none => write)
00:00.699  SYSTEM                 LOCK /home/teg/.opam/repo/lock (none => write)

<><> Updating package repositories ><><><><><><><><><><><><><><><><><><><><><><>
00:00.703  PARALLEL               Iterate over 1 task(s) with 3 process(es)
00:00.703  PARALLEL               Starting job 0 (worker 1/3): 0
00:00.703  SYSTEM                 mkdir /tmp/opam-17122-e3a56e
00:06.658  REPOSITORY             update default from https://opam.ocaml.org
00:06.658  CURL                   pull-repo-update
00:06.658  SYSTEM                 mkdir /tmp/opam-17122-e3a56e/default.new
00:06.658  SYSTEM                 mkdir /tmp/opam-17122-46fd0e
00:06.659  PARALLEL               Next task in job 0: /usr/bin/curl --write-out %{http_code}\n --retry 3 --retry-delay 2 --user-agent opam/2.1.2 -L -o /tmp/opam-17122-46fd0e/index.tar.gz.part -- https://opam.ocaml.org/index.tar.gz
Processing  1/1: [default: http]
00:14.034  PARALLEL               Collected task for job 0 (ret:0)
00:14.040  SYSTEM                 [log-17122-f45b9b] (in 0.005s) mv /tmp/opam-17122-46fd0e/index.tar.gz.part /tmp/opam-17122-46fd0e/index.tar.gz
00:14.040  PARALLEL               Next task in job 0: /usr/bin/tar xfz /tmp/opam-17122-46fd0e/index.tar.gz -C /tmp/opam-17122-e3a56e/default.new
00:19.604  PARALLEL               Collected task for job 0 (ret:2)
00:19.614  SYSTEM                 rmdir /tmp/opam-17122-46fd0e
00:19.619  SYSTEM                 rmdir /tmp/opam-17122-e3a56e/default.new
[ERROR] Could not update repository "default": Failed to extract archive /tmp/opam-17122-46fd0e/index.tar.gz:
        "/usr/bin/tar xfz /tmp/opam-17122-46fd0e/index.tar.gz -C /tmp/opam-17122-e3a56e/default.new" exited with
        code 2
00:21.366  PARALLEL               Job 0 finished
00:21.366  FILE(repos-config)     Wrote /home/teg/.opam/repo/repos-config in 0.000s
00:21.366  SYSTEM                 rm /home/teg/.opam/repo/state-11ABC690.cache
00:21.373  SYSTEM                 LOCK /home/teg/.opam/repo/state-11ABC690.cache (none => write)
00:21.373  CACHE(repository)      Writing the repository cache to ~/.opam/repo/state-11ABC690.cache ...
00:22.029  CACHE(repository)      ~/.opam/repo/state-11ABC690.cache written in 0.656s
00:22.029  SYSTEM                 LOCK /home/teg/.opam/repo/state-11ABC690.cache (write => none)
00:22.029  SYSTEM                 LOCK /home/teg/.opam/repo/lock (write => none)
00:22.108  SYSTEM                 LOCK /home/teg/.opam/5.0.0/.opam-switch/lock (none => none)
00:22.108  SYSTEM                 rmdir /tmp/opam-17122-e3a56e/default
00:25.562  SYSTEM                 rmdir /tmp/opam-17122-e3a56e
00:25.566  SYSTEM                 LOCK /home/teg/.opam/repo/lock (none => none)
00:25.566  SYSTEM                 LOCK /home/teg/.opam/config.lock (write => none)

The man page for tar (GNU tar 1.34) gives this for return code 2:

2      Fatal error.  This means that some fatal, unrecoverable error occurred.

The file system has plenty of room:

teg@hilbert:~$ df -H /tmp
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda5       2.0G   97M  1.8G   6% /tmp
teg@hilbert:~$ opam config report
# opam config report
# opam-version         2.1.2 
# self-upgrade         no
# system               arch=x86_64 os=linux os-distribution=debian os-version=12
# solver               builtin-mccs+glpk
# install-criteria     -removed,-count[avoid-version,changed],-count[version-lag,request],-count[version-lag,changed],-count[missing-depexts,changed],-changed
# upgrade-criteria     -removed,-count[avoid-version,changed],-count[version-lag,solution],-count[missing-depexts,changed],-new
# jobs                 3
# repositories         1 (http), 1 (version-controlled) (default repo at ed47929e)
# pinned               0
# current-switch       5.0.0
# ocaml:native         true
# ocaml:native-tools   true
# ocaml:native-dynlink true
# ocaml:stubsdir       /home/teg/.opam/5.0.0/lib/ocaml/stublibs:/home/teg/.opam/5.0.0/lib/ocaml
# ocaml:preinstalled   false
# ocaml:compiler       5.0.0

kit-ty-kate commented 1 year ago

Could you give us the output of df -ih?

jteg68 commented 1 year ago

teg@hilbert:~$ df -ih
Filesystem     Inodes IUsed IFree IUse% Mounted on
udev             971K   511  971K    1% /dev
tmpfs            981K  1.1K  980K    1% /run
/dev/sda2        1.5M  373K  1.1M   25% /
tmpfs            981K     1  981K    1% /dev/shm
tmpfs            981K     5  981K    1% /run/lock
/dev/sda3        597K   19K  579K    4% /var
/dev/sda5        120K   23K   97K   20% /tmp
/dev/sda1           0     0     0     - /boot/efi
/dev/sda6        115M  260K  114M    1% /home
tmpfs            197K   131  196K    1% /run/user/1001
tmpfs            197K   142  196K    1% /run/user/1000

I copied the downloaded index.tar.gz from the /tmp/opam-xxx-xxx directory and extracted it in a fresh directory under /tmp. No problem detected; the extracted files look OK.

kit-ty-kate commented 1 year ago

/dev/sda5 120K 23K 97K 20% /tmp

Mmh, 97K free inodes might be on the low side for the opam-repository. Testing locally, after untarring https://opam.ocaml.org/index.tar.gz I've already used 61K.

tmpfs            992K   61K  931K    7% /tmp/test

Could you check whether increasing the number of inodes fixes your problem? sudo mount -t tmpfs tmpfs /tmp should give you a bit more (you can unmount it afterwards and get back the files that were there before).
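The comparison made above (entries in the archive versus free inodes on the target filesystem) can be sketched in shell. This is a minimal sketch, assuming GNU df and tar; the function names are made up for illustration and are not part of opam. Note that some filesystems report 0 free inodes because they allocate them dynamically, in which case the check is not meaningful. The debug log earlier in the thread also shows opam 2.1 removing a `default` directory next to `default.new` under /tmp, so the peak need may be larger than one extracted copy; that is an inference from the log, not something confirmed here.

```shell
# Print the number of free inodes on the filesystem holding directory $1.
# -P gives one line per filesystem; -i reports inodes instead of blocks.
free_inodes() {
    df -Pi "$1" | awk 'NR == 2 { print $4 }'
}

# Print the number of entries (files and directories) in the
# gzip-compressed tarball $1. Each entry needs one inode when extracted.
archive_entries() {
    tar tzf "$1" | wc -l | tr -d ' '
}

# Succeed if directory $2 has at least as many free inodes as archive $1
# has entries. Hypothetical helper; it ignores any other copies a tool
# may keep on the same filesystem at the same time.
fits_in() {
    [ "$(archive_entries "$1")" -le "$(free_inodes "$2")" ]
}

# Usage (not run here):
#   fits_in index.tar.gz /tmp || echo "probably not enough inodes on /tmp"
```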

dbuenzli commented 1 year ago

Btw. just thinking out loud. Would it maybe make sense for opam to just work on the "file system" of the index.tar file ?

jteg68 commented 1 year ago

Thanks, with more inodes the update sailed through. Working directly on the index.tar file, is there a cmd line option for that?

dbuenzli commented 1 year ago

@jteg68 no that was just me thinking very loudly.

kit-ty-kate commented 1 year ago

Thanks, with more inodes the update sailed through.

Awesome, good to know.

Looking at the difference between 2.1 and master, it seems like your issue should be fixed by using master:

git clone https://github.com/ocaml/opam
make -C opam cold
sudo install ./opam/opam /usr/local/bin/opam

in master we untar directly into the opam root (~/.opam in your case) so /tmp isn't used as much.

Would it maybe make sense for opam to just work on the "file system" of the index.tar file ?

Mmmh, that's interesting, we could use tar -d but sadly this option seems to be a bit finicky and not portable (only present in GNU tar). Were you thinking of something else?

dbuenzli commented 1 year ago

Were you thinking of something else?

I have no idea how opam uses all the info so take the following with a bag of salt.

But I mean I don't know if it's useful to have all these files untarred on my disk. Just mmap the tar file and navigate the tar format.

MdeLv commented 1 year ago

Hi, +1. I still don't know how to fix this. One of my Debian 11 / opam 2.1.4 servers works well with opam update. Another one has been stuck for a few days with the same message as @jteg68 got.

Running opam init --reinit -ni gives the same error, as does reinstalling opam:

$ bash -c "sh <(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh)"
## Downloading opam 2.1.4 for linux on x86_64...
## Downloaded.
## Where should it be installed ? [/usr/local/bin] 
Write access to /usr/local/bin required, using 'sudo'.
Command: rm -f /usr/local/bin/opam
Write access to /usr/local/bin required, using 'sudo'.
Command: install -m 755 /tmp/opam-2.1.4-x86_64-linux /usr/local/bin/opam
## opam 2.1.4 installed to /usr/local/bin
## Converting the opam root format & updating
No configuration file found, using built-in defaults.
Checking for available remotes: rsync and local, git.
  - you won't be able to use mercurial repositories unless you install the hg command on your system.
  - you won't be able to use darcs repositories unless you install the darcs command on your system.

<><> Updating repositories ><><><><><><><><><><><><><><><><><><><><><><><><><><>
[multicore] no changes from git+https://github.com/ocamllabs/multicore-opam.git
[ERROR] Could not update repository "default": Failed to extract archive /tmp/opam-3628659-e3393d/index.tar.gz: "/usr/bin/tar xfz /tmp/opam-3628659-e3393d/index.tar.gz -C /tmp/opam-3628659-bfd99e/default.new"
        exited with code 2

I tried with 4 different so-called "switches" and got the same result.

I can manually download and extract the opam index: wget https://opam.ocaml.org/index.tar.gz && tar xfz index.tar.gz -C tmp_opam/

Btw, I don't know why a simple opam update takes so much time.

There is enough storage on /tmp and /home (several GB).

What information do you need to know exactly what is going on with this opam configuration? I hope we are only a few people impacted by this trouble. Thanks.

ETA: I remembered I had a "no space left on device" error on a low-resource Debian server. I had to find where the opam logs are to investigate that possibility (btw, I found no information about the log location in opam help, and not even the word "log" on any page of the official documentation at https://opam.ocaml.org/doc/).

$ ll -h -tr ~/.opam/log/
-rw-r--r-- 1 ml ml  309 Mar 17 12:15 log-3628842-be11d0.info
-rw-r--r-- 1 ml ml 4.2K Mar 17 12:15 log-3628842-be11d0.env
-rw-r--r-- 1 ml ml 7.2M Mar 17 12:15 log-3628842-be11d0.out

There is space available on the device (opam used to use /tmp and uses /home):

df -h
Size  Used Avail Use% Mounted on
7.8G  4.0K  7.8G   1% /dev
1.6G  3.5M  1.6G   1% /run
 23G   20G  2.3G  90% /
1.8G  184M  1.6G  11% /tmp
9.1G  5.4G  3.3G  63% /var
304G  249G   41G  86% /home
...

However, I'm surprised to see the following logs:

$ tail  ~/.opam/log/log-3628842-be11d0.out 
/usr/bin/tar: packages/nlopt-ocaml: Cannot mkdir: No space left on device
/usr/bin/tar: packages/nlopt-ocaml/nlopt-ocaml.0.4: Cannot mkdir: No such file or directory
/usr/bin/tar: packages/nlopt-ocaml: Cannot mkdir: No space left on device
/usr/bin/tar: packages/nlopt-ocaml/nlopt-ocaml.0.4/opam: Cannot open: No such file or directory
/usr/bin/tar: packages/nlopt-ocaml: Cannot mkdir: No space left on device
/usr/bin/tar: packages/nlopt-ocaml/nlopt-ocaml.0.5.1: Cannot mkdir: No such file or directory
/usr/bin/tar: packages/nlopt-ocaml: Cannot mkdir: No space left on device
/usr/bin/tar: packages/nlopt-ocaml/nlopt-ocaml.0.5.1/opam: Cannot open: No such file or directory
/usr/bin/tar: repo: Cannot open: No space left on device
/usr/bin/tar: Exiting with failure status due to previous errors

As /tmp and /home appear to have enough space available, I can't tell what is happening. Maybe just a tar error related to a nonexistent dir (see the -C option used with tar)?

btw, why is this precious as well as hidden log information not displayed when using opam update with --verbose?
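"No space left on device" (ENOSPC) is reported both when a filesystem runs out of blocks and when it runs out of inodes, so df -h alone cannot rule it out. A minimal sketch for locating the relevant logs, assuming the log-*.out naming shown in the listing above; the function name is made up for illustration:

```shell
# List the opam log files in directory $1 that contain an ENOSPC message.
# Prints nothing (and returns non-zero) when no log matches.
logs_with_enospc() {
    grep -l 'No space left on device' "$1"/log-*.out 2>/dev/null
}

# Usage (not run here), assuming the default opam root:
#   logs_with_enospc ~/.opam/log
# If a log matches, compare block usage and inode usage of the filesystem:
#   df -h /tmp
#   df -ih /tmp
```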

kit-ty-kate commented 1 year ago

@MdeLv sorry you're experiencing the same issue. Please read the discussion above, the solution(s) and cause of the issue (lack of inodes, not lack of space) are all there already. If you want a summary, the solutions are either:

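1. increase the number of inodes available on /tmp, for example with sudo mount -t tmpfs tmpfs /tmp as described above, or
2. install opam master with these 3 simple commands: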

git clone https://github.com/ocaml/opam
make -C opam cold
sudo install ./opam/opam /usr/local/bin/opam

MdeLv commented 1 year ago

Thanks for your quick answer.

Solution 1 worked for me.

$ df -ih
Inodes IUsed IFree IUse% Mounted on
120K   43K   77K   36% /tmp
  ...

sudo mount -t tmpfs tmpfs /tmp

$ df -ih
Inodes IUsed IFree IUse% Mounted on
2.0M     1  2.0M    1% /tmp
...

Reinstalling opam the curl way (to override the installed binary I just compiled):

bash -c "sh <(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh)"
...
<><> Updating repositories ><><><><><><><><><><><><><><><><><><><><><><><><><><>
[default] synchronised from https://opam.ocaml.org
[multicore] no changes from git+https://github.com/ocamllabs/multicore-opam.git

Now OK! (same with opam update) Thanks.

Btw, just before your message, I was trying solution 2 (git clone https://github.com/ocaml/opam && make -C opam cold && sudo install ./opam/opam /usr/local/bin/opam). As it updates the repo from 2.1 to 2.2 alpha, it asks to confirm that irreversible action, without offering a backup. What is the recommended way to back up an opam repository on the host? Is it enough to make an archive of ~/.opam?

MdeLv commented 1 year ago

!!! Breaking opam commands !!!

I initially had a side-comment about doing a backup of the opam repository configuration of the host before upgrading from 2.1 to 2.2 alpha. I can see 107 GB in ~/.opam for 10 compilers and package sets, which seems huge to me.

opam switch list triggered an unusual message:

$ opam switch list
[ERROR] Opam has not been initialised, please run `opam init'

Then:

$ opam init
No configuration file found, using built-in defaults.
Checking for available remotes: rsync and local, git.
  - you won't be able to use mercurial repositories unless you install the hg
    command on your system.
  - you won't be able to use darcs repositories unless you install the darcs
    command on your system.
...
<><> Creating initial switch 'default' (invariant ["ocaml" {>= "4.05.0"}] - initially with ocaml-base-compiler) 

During this compilation, I discovered that my compiler opam configurations disappeared. See details: https://github.com/ocaml/opam/issues/5485

rjbou commented 1 year ago

But I mean I don't know if it's useful to have all these files untared on my disk. Just mmap the tar file and navigate the tar format.

There is an issue about using a tar library instead of the tar command; with that, it would be possible to read the files from the archive without writing them to disk.
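To illustrate the idea of reading from the archive without extracting it, here is a minimal shell sketch using tar's -O option (extract to standard output, available in GNU tar and bsdtar). It is not how opam works and not the tar-library approach mentioned above: it rescans the compressed stream on every call, so it only shows that no files or inodes need to be created. The function name is made up for illustration.

```shell
# Print the content of member $2 of the gzip-compressed tarball $1 on
# standard output. Nothing is written to disk.
read_member() {
    tar -xzOf "$1" "$2"
}

# Usage (not run here), with a path taken from the tar log above:
#   read_member index.tar.gz packages/nlopt-ocaml/nlopt-ocaml.0.4/opam
```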

jteg68 commented 1 year ago

Hi again @kit-ty-kate ,

I just tried upgrading opam to master but the same problem persists. Was the change to use the opam root reverted? With a hard-coded path in opamRepositoryState.ml, around line 173, opam update works without issue.

jruere commented 9 months ago

Changing the temp path worked for me: TMPDIR=~/tmp/ opam update.
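The same workaround as a small reusable wrapper: a minimal sketch, assuming the chosen directory lives on a filesystem with enough free inodes. The function name is made up for illustration; only the TMPDIR variable comes from the comment above.

```shell
# Run a command with TMPDIR pointing at directory $1, creating the
# directory first if needed. The remaining arguments are the command.
with_tmpdir() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    TMPDIR=$dir "$@"
}

# Usage (not run here):
#   with_tmpdir "$HOME/tmp" opam update
```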