haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal

"truncated tar archive" on cabal update #5518

Open AndreasPK opened 6 years ago

AndreasPK commented 6 years ago

I had some connection issues lately, which seem to have caused cabal to only partially download some files when updating the index.

This results in the error below, which gives no indication of how to fix it.

Andi@Horzube MINGW64 ~/trees/someCalls
$ cabal new-update
Downloading the latest package list from hackage.haskell.org
truncated tar archive

After some search I found that deleting .cabal/packages fixes the issue.

Ideally, when cabal detects this, it should either invalidate the cache and re-download it, or at least give the user the path to the offending file.

hvr commented 6 years ago

@AndreasPK do you happen to have saved the packages folder before removing it? it's a bit hard to figure out what exactly happened with the information you gave us

AndreasPK commented 6 years ago

do you happen to have saved the packages folder before removing it?

Sadly I did not think of that.

I did find these related issues.

If I hit the issue again I will hopefully remember to keep it around.

AndreasPK commented 6 years ago

I remembered!

Deleting packages/head.hackage* did not fix the issue. Deleting packages/ did.

LucianU commented 5 years ago

I've also hit this issue by interrupting cabal new-update with Ctrl-C. After that, even running cabal new-install hlint would trigger the same error about "truncated tar archive".

What tar file does cabal try to read before installing something?

harendra-kumar commented 5 years ago

I have started facing the same issue. I thought this was due to some temporary connection issue, but it does not seem to go away while the network connectivity seems perfect for all other tasks. I have noticed the error when trying to use cabal update/new-update as well as cabal list, not sure what other commands are affected.

harendra-kumar commented 5 years ago

@hvr there seem to be two issues here:

1) cabal does not provide good error messages in general. In this particular case it just said:

cutlass:/vol/hosts/cueball/workspace/packages/streamly-dom  (master)$ cabal new-update
Downloading the latest package list from hackage.haskell.org
truncated tar archive

It could have at least told us which file it was reading and considered truncated; that would make the problem much easier to diagnose. dtruss was also not working for me, so I just guessed that it might be the index tar. I found there are two tars in the packages directory, 00-* and 01-*, and saw that 01-*.tar was much smaller than 00-*.tar. Thinking that it must have been truncated at some point, I removed those files (rm 01-index.*) and the problem went away. I still have the old index files, but I think the cause is obvious: at some point the full file was not downloaded due to a network problem.

2) It becomes a deadlock situation for cabal new-update. For it to work, the old tar must be good, so if it gets corrupted even once it stops working forever. If the tar is corrupted, cabal could suggest that the user remove that file. Better still, it should not depend on the old tar being good at all and could just overwrite it with a new update, as this is just a cache.

hvr commented 5 years ago

@harendra-kumar Yes, totally agree the error message is insufficient...

Do you happen to know the filesizes of the faulty 01-index.tar and the 01-index.tar.gz files? In principle cabal was supposed to auto-recover from corrupted package indices with the latest hackage-security version, maybe there's still some corner case that hasn't been dealt with.

harendra-kumar commented 5 years ago

@hvr yes, the sizes of the truncated ones first:

cutlass:~/cabal-corrupt$ lal
total 316952
drwxr-xr-x   13 harendra  staff       442 Jan 14 04:12 .
drwxr-xr-x+ 127 harendra  staff      4318 Jan 14 04:04 ..
-rw-r--r--    1 harendra  staff   5029694 Jan 12 05:53 01-index.cache
-rw-r--r--    1 harendra  staff  76967200 Jan 13 09:06 01-index.tar
-rw-r--r--    1 harendra  staff  76405323 Jan 13 09:06 01-index.tar.gz
-rw-r--r--    1 harendra  staff   3780566 Jan 12 05:22 01-index.tar.idx
-rw-r--r--    1 harendra  staff         0 Sep  2 06:57 01-index.tar19891-6.gz
-rw-r--r--    1 harendra  staff     88306 Jan 13 09:06 01-index.tar48917-7.gz
-rw-r--r--    1 harendra  staff         0 Dec 23  2017 01-index.tar52064-6.gz
-rw-r--r--    1 harendra  staff         0 Dec 22  2017 01-index.tar83117-9.gz
-rw-r--r--    1 harendra  staff         0 Dec 22  2017 01-index.tar90652-6.gz
-rw-r--r--    1 harendra  staff         4 Jan 13 09:06 01-index.timestamp

The sizes after an update:

cutlass:~/cabal-corrupt$ lal ~/.cabal/packages/hackage.haskell.org/01-index.*
-rw-r--r--  1 harendra  staff    5047361 Jan 14 03:49 /Users/harendra/.cabal/packages/hackage.haskell.org/01-index.cache
-rw-r--r--  1 harendra  staff  565596672 Jan 14 03:49 /Users/harendra/.cabal/packages/hackage.haskell.org/01-index.tar
-rw-r--r--  1 harendra  staff   76423383 Jan 14 02:58 /Users/harendra/.cabal/packages/hackage.haskell.org/01-index.tar.gz
-rw-r--r--  1 harendra  staff    3781558 Jan 14 02:58 /Users/harendra/.cabal/packages/hackage.haskell.org/01-index.tar.idx
-rw-r--r--  1 harendra  staff          4 Jan 14 03:49 /Users/harendra/.cabal/packages/hackage.haskell.org/01-index.timestamp

It seems only index.tar is truncated, index.tar.gz seems to be fine. Maybe we could have recovered by just using gunzip on it again. In fact I tried gunzip on it and it was successful and generated a tar of the same size as now. Strangely the size of the corrupted .tar is very close to the size of the .tar.gz. I tried to check if the .tar.gz accidentally got copied to .tar but I found it was not in gzip format, so that is not the case.
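One cheap check that follows from the listings: a tar archive is a sequence of 512-byte blocks, so its size should be a multiple of 512. The broken 01-index.tar (76967200 bytes) leaves a remainder of 288, while the regenerated one (565596672 bytes) divides evenly. A minimal sketch of that check; `tar_size_ok` is a hypothetical helper name, not something cabal provides, and passing it does not prove the archive is intact.

```shell
# Hypothetical helper (not part of cabal): succeed if FILE's size is a
# multiple of 512, the tar block size. A failure means the file was cut
# off mid-block; a success does NOT prove the archive is complete.
tar_size_ok() {
  size=$(wc -c < "$1") || return 2   # 2 = file could not be read
  [ $((size % 512)) -eq 0 ]
}

# Example:
#   tar_size_ok ~/.cabal/packages/hackage.haskell.org/01-index.tar \
#     || echo "index tar looks truncated"
```

This only catches truncation that ends in the middle of a block, so it is a quick first test rather than a full validation.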

jgm commented 5 years ago

Ran into the same issue when I ctrl-C'd an update. Workaround of gunzip 01-index.tar.gz worked. But this is a serious problem which could really trip people up, and it should be fixed.
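For anyone who wants to try the gunzip workaround without losing the evidence, here is a rough sketch. It keeps the broken 01-index.tar under a different name and re-creates it from 01-index.tar.gz. `rebuild_index_tar` is a hypothetical helper, and the directory argument assumes the ~/.cabal/packages/hackage.haskell.org layout shown earlier in this thread. It may not be enough in every case, so treat it as a first attempt.

```shell
# Hypothetical helper (not part of cabal): re-create 01-index.tar from
# 01-index.tar.gz inside DIR, keeping the old tar as 01-index.tar.broken.
rebuild_index_tar() {
  dir=$1
  # Refuse to touch anything unless the compressed index is itself intact.
  gzip -t "$dir/01-index.tar.gz" || return 1
  if [ -e "$dir/01-index.tar" ]; then
    mv "$dir/01-index.tar" "$dir/01-index.tar.broken" || return 1
  fi
  # -c writes to stdout, so the .tar.gz is left in place.
  gunzip -c "$dir/01-index.tar.gz" > "$dir/01-index.tar"
}

# Example:
#   rebuild_index_tar ~/.cabal/packages/hackage.haskell.org
```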

23Skidoo commented 5 years ago

So this is basically due to some file writes being non-atomic?

gbaz commented 5 years ago

A good improvement would be to make file writes atomic (with an approach sort of like http://hackage.haskell.org/package/safeio-0.0.5.0/docs/src/System-IO-SafeWrite.html#withOutputFile).

But it would be good to make the code hardened against bad states of these files regardless (i.e. even if the reason they were bad wasn't due to partial writes).

hvr commented 5 years ago

Fwiw, iirc we already try to do this atomically in some places by creating temporary files (you might have noticed files such as timestamp28532-1.json cluttering the package cache folder) which get moved over to the proper filename atomically. But still, I agree strongly with Gershom that we should also focus on recovering from corrupt states, as that's going to cover a lot more scenarios.
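The temp-file-then-rename pattern mentioned here can be sketched in shell. This only illustrates the general idea and is not how cabal or hackage-security implement it; `atomic_write` is a hypothetical name. The temporary file is created in the destination directory so that the final mv is a rename within one filesystem, which is atomic on POSIX systems.

```shell
# Hypothetical helper: run a command and store its stdout in DEST so that
# readers see either the old complete file or the new complete file,
# never a partial one.
# Usage: atomic_write DEST COMMAND [ARGS...]
atomic_write() {
  dest=$1
  shift
  # Same directory as DEST, so the rename never crosses filesystems.
  # Note: mktemp creates the file with mode 0600.
  tmp=$(mktemp "$(dirname "$dest")/.tmp.XXXXXX") || return 1
  if "$@" > "$tmp" && mv "$tmp" "$dest"; then
    return 0
  fi
  rm -f "$tmp"   # command or rename failed: leave DEST untouched
  return 1
}

# Example:
#   atomic_write 01-index.tar gunzip -c 01-index.tar.gz
```

If the process is interrupted before the cleanup runs, a stray temporary file can be left behind, but the destination is still either the old or the new complete file.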

nh2 commented 3 years ago

Ran into the same issue when I ctrl-C'd an update.

This is still a problem 3 years later.

For me rm ~/.cabal/packages/hackage.haskell.org/01-index.* did not help, it brought me the error of https://github.com/haskell/cabal/issues/4987.

I had to do rm -r ~/.cabal/packages/hackage.haskell.org/ to make it work again.

mauke commented 9 months ago

This is still broken in 3.10.2.1.

$ cabal --version
cabal-install version 3.10.2.1
compiled using version 3.10.2.1 of the Cabal library
$ cabal update
Downloading the latest package list from hackage.haskell.org
truncated tar archive

gbaz commented 9 months ago

The above change wouldn't prevent the "truncated tar archive" message, but would simply make it nonfatal in the case it was recoverable (i.e. invalidate the offending file and continue). I can't tell if that's what's going on from your logs or not...

mauke commented 9 months ago

After that message, cabal exits and I'm back at the shell prompt. echo $? indicates it exited with a non-zero status. No idea which file is the "offending" one because cabal won't say.

I ran it 3 or 4 times and it is stuck in that state, making no progress.

In the end, I "fixed" it by doing rm -r ~/.cabal/packages/hackage.haskell.org/.
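Since it was asked earlier in the thread whether anyone had kept the broken packages folder, a variant of this workaround that moves the directory aside instead of deleting it may be more useful for debugging. A sketch only: `stash_hackage_cache` is a hypothetical helper name, and the default path is the one used in the rm -r command above.

```shell
# Hypothetical helper (not part of cabal): move the per-repository cache
# aside instead of deleting it, so the broken files can be inspected or
# attached to a bug report. Prints the new location on success.
stash_hackage_cache() {
  dir=${1:-"$HOME/.cabal/packages/hackage.haskell.org"}
  dir=${dir%/}                       # drop a trailing slash, if any
  [ -d "$dir" ] || return 1
  backup="$dir.broken.$(date +%Y%m%d%H%M%S)"
  mv "$dir" "$backup" && printf '%s\n' "$backup"
}

# Example:
#   stash_hackage_cache && cabal update
```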

gbaz commented 9 months ago

Gotcha. I think the above PR must not have caught all failing cases. Anyone who wants to further investigate and patch (hopefully along the same lines) would be very welcome.

LucianU commented 7 months ago

Is the index required for cabal to operate? I haven't looked at the code in detail, so I hope somebody can clarify this instead.

I'm trying to figure out if there are states in which cabal must stop running and throw an error or show a descriptive message.

gbaz commented 7 months ago

The index is required for the solver to operate.

LucianU commented 6 months ago

@gbaz, this is a slightly different issue, but I notice that, if I don't have an index, cabal only shows a warning and later throws an error when it can't find a library. Wouldn't it make more sense for it to stop execution as soon as it can't find the index, since it needs it to operate?

Also, to mention here what I said in #4987, it looks like the following files can break cabal update. That is, only after I deleted all of them was cabal update able to generate a new index:

.rw-r--r-- 7.9M lucian 25 Apr 17:38 01-index.cache
.rw-r--r-- 904M lucian 25 Apr 17:38 01-index.tar
.rw-r--r-- 121M lucian 25 Apr 17:38 01-index.tar.gz
.rw-r--r-- 5.3M lucian 25 Apr 17:38 01-index.tar.idx
.rw-r--r--    4 lucian 25 Apr 17:38 01-index.timestamp
.rw-r--r--  970 lucian 25 Apr 17:38 snapshot.json
.rw-r--r--  464 lucian 25 Apr 17:38 timestamp.json

soulomoon commented 5 months ago

Happened to me again, and this fixed it:

Ran into the same issue when I ctrl-C'd an update.

This is still a problem 3 years later.

For me rm ~/.cabal/packages/hackage.haskell.org/01-index.* did not help, it brought me the error of #4987.

I had to do rm -r ~/.cabal/packages/hackage.haskell.org/ to make it work again.