NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Flake: cannot extract input tarball / zip that has more than 1 top-level directory #7083

Open dixslyf opened 2 years ago

dixslyf commented 2 years ago

Describe the bug

When specifying a flake input that's a tarball or zip, Nix complains that the archive contains an unexpected number of top-level files if the archive contains more than 1 top-level directory.

This seems to have been brought up before in this discourse thread back in February. The root cause lies in the fetchTarball fetcher; in particular this section. I'm not sure what the rationale is for expecting only 1 top-level directory, but in the context of flake inputs, it can be quite a blocker when fetching archives from other parties, as we cannot control how they're structured.

Steps To Reproduce

Example flake input:

catppuccin-gtk-macchiato = {
  url = "https://github.com/catppuccin/gtk/raw/fc336313a84e0d7ec1a3499047fb1e73eef8a005/Releases/Catppuccin-Macchiato.zip";
  flake = false;
};

When attempting to build, the following error is shown:

error: tarball 'https://github.com/catppuccin/gtk/raw/fc336313a84e0d7ec1a3499047fb1e73eef8a005/Releases/Catppuccin-Macchiato.zip' contains an unexpected number of top-level files

Expected behavior

The tarball / zip should be extracted without any errors.

nix-env --version output

nix-env (Nix) 2.11.0

thufschmitt commented 2 years ago

Indeed.

> I'm not sure what the rationale is for expecting only 1 top-level directory

The reason is that it's apparently very common to have all of an archive's contents under one top-level directory (in particular, that's what git archive, GitHub, and most hosting sites do). So the logic in Nix and in nixpkgs's fetchzip is to extract the archive and cd into that unique directory.

I think that behavior is the right one (it's what people will spontaneously expect), but that shouldn't prevent us from supporting archives with multiple top-level directories. I think we can either:

  1. Transparently not cd into anything if there's more than one top-level directory (so an archive with two top-level directories a and b will behave exactly as if they were subdirectories of a unique top-level one, top/a and top/b)
  2. Have a dedicated argument for controlling this behavior (and make the error message point towards it)

I personally have a slight preference for the first option because it is "nicer" in that most things work out of the box with it. But it might also be a bit confusing because an archive containing only a will behave differently from an archive containing both a and b.
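To make option 1 concrete, here is a minimal sketch of the decision it describes, written as a plain Nix function over an already-extracted directory. The name chooseRoot is hypothetical and this is not how the fetcher is actually implemented (that logic lives inside Nix itself); it only illustrates the "descend into a single top-level directory, otherwise leave things alone" rule.

```nix
# Hypothetical helper, not part of Nix or nixpkgs: given the directory an
# archive was extracted into, pick the path to treat as its root.
let
  chooseRoot = extracted:
    let
      # builtins.readDir maps each entry name to its type,
      # e.g. { a = "directory"; README = "regular"; }
      entries = builtins.readDir extracted;
      names = builtins.attrNames entries;
      only = builtins.head names; # evaluated lazily, only when there is exactly one entry
    in
      if builtins.length names == 1 && entries.${only} == "directory"
      then extracted + "/${only}" # single top-level directory: descend into it
      else extracted;             # anything else: keep the extracted tree as-is
in
  chooseRoot
```

With this rule, an archive containing only a yields the contents of a, while an archive containing a and b yields a directory with both, which is exactly the asymmetry mentioned above.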

dixslyf commented 2 years ago

> The reason is that it's apparently very common to have all of an archive's contents under one top-level directory (in particular, that's what git archive, GitHub, and most hosting sites do). So the logic in Nix and in nixpkgs's fetchzip is to extract the archive and cd into that unique directory.

Ahh, that indeed makes sense.

I do personally prefer the first option as well for the same reason, but I think the second would be preferable as a solution. Confusing behaviour isn't ideal. Besides, after a bit of digging, I see that there is already a stripRoot option for fetchzip that does exactly what we want, which unfortunately doesn't exist for fetchTarball. It would be good to add the same option to fetchTarball for consistency, though it may be more difficult to implement because it's a built-in and because we would need to add a way to set the option through the flake input attrs.
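For anyone hitting this in the meantime, a possible workaround is to fetch the archive with fetchzip inside the build instead of declaring it as a flake input. The sketch below assumes a nixpkgs set is available as pkgs and uses a placeholder hash, so treat it as a starting point and not a tested recipe.

```nix
# Sketch of a workaround outside the flake input mechanism.
{ pkgs ? import <nixpkgs> { } }:

pkgs.fetchzip {
  url = "https://github.com/catppuccin/gtk/raw/fc336313a84e0d7ec1a3499047fb1e73eef8a005/Releases/Catppuccin-Macchiato.zip";
  # Keep every top-level entry instead of descending into a single root.
  stripRoot = false;
  # Placeholder: the first build fails and reports the real hash to paste here.
  hash = pkgs.lib.fakeHash;
}
```

The downside is that the archive is then pinned by hand in the expression and no longer tracked in flake.lock.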

egasimus commented 1 year ago

This just bit me when trying to package https://github.com/vector-im/hydrogen-web/releases/download/v0.3.8/hydrogen-web-0.3.8.tar.gz which, funnily enough, appears in file-roller with 1 top-level dir called "."

Oh well. fetchurl it is, then...

fgaz commented 1 year ago

I implemented (1) in #9053

fgaz commented 2 months ago

I think this was fixed in #11195