commercialhaskell / stack

The Haskell Tool Stack
http://haskellstack.org
BSD 3-Clause "New" or "Revised" License
3.98k stars 845 forks source link

Git dependencies fail when they contain symlinks to outside the repository, or broken symlinks #4913

Open michaelpj opened 5 years ago

michaelpj commented 5 years ago

General summary/comments (optional)

We have https://github.com/shmish111/servant-purescript as a git dependency. New versions of Stack fail with a tarball error.

It seems there have been some issues in this vein already, e.g. https://github.com/commercialhaskell/stack/issues/4580.

Steps to reproduce

Add

extra-deps:
- git: https://github.com/shmish111/servant-purescript.git
  commit: ab14502279c92084f06aa6222a17873275279e63

to your stack.yaml and stack build.

Expected

It works.

Actual

2019-06-27 10:12:59.579419: [info] Cloning ab14502279c92084f06aa6222a17873275279e63 from https://github.com/shmish111/servant-purescript.git
2019-06-27 10:12:59.579589: [debug] Run process within /tmp/michael/with-repo11032: /run/current-system/sw/bin/git clone https://github.com/shmish111/servant-purescript.git cloned
2019-06-27 10:13:00.705248: [debug] Process finished in 1126ms: /run/current-system/sw/bin/git clone https://github.com/shmish111/servant-purescript.git cloned
2019-06-27 10:13:00.705527: [debug] Run process within /tmp/michael/with-repo11032/cloned: /run/current-system/sw/bin/git reset --hard ab14502279c92084f06aa6222a17873275279e63
2019-06-27 10:13:00.710885: [debug] Process finished in 5ms: /run/current-system/sw/bin/git reset --hard ab14502279c92084f06aa6222a17873275279e63
2019-06-27 10:13:00.711107: [debug] Run process within /tmp/michael/with-repo11032/cloned: /run/current-system/sw/bin/git submodule update --init --recursive
2019-06-27 10:13:00.749071: [debug] Process finished in 38ms: /run/current-system/sw/bin/git submodule update --init --recursive
2019-06-27 10:13:00.749315: [debug] Run process within /tmp/michael/with-repo11032/cloned: /run/current-system/sw/bin/git -c core.autocrlf=false archive -o /tmp/michael/with-repo-archive11032/foo.tar HEAD
2019-06-27 10:13:00.752674: [debug] Process finished in 3ms: /run/current-system/sw/bin/git -c core.autocrlf=false archive -o /tmp/michael/with-repo-archive11032/foo.tar HEAD
2019-06-27 10:13:00.752884: [debug] Run process within /tmp/michael/with-repo11032/cloned: /run/current-system/sw/bin/git submodule foreach --recursive "git -c core.autocrlf=false archive --prefix=$displaypath/ -o bar.tar HEAD && if [ -f bar.tar ]; then tar --force-local -Af /tmp/michael/with-repo-archive11032/foo.tar bar.tar ; fi"
2019-06-27 10:13:00.788258: [debug] Process finished in 35ms: /run/current-system/sw/bin/git submodule foreach --recursive "git -c core.autocrlf=false archive --prefix=$displaypath/ -o bar.tar HEAD && if [ -f bar.tar ]; then tar --force-local -Af /tmp/michael/with-repo-archive11032/foo.tar bar.tar ; fi"
2019-06-27 10:13:00.793175: [debug] parseArchive of GZIP-ed tar file: ZlibException (-3)
2019-06-27 10:13:00.794082: [error] Unsupported tarball from /tmp/michael/with-repo-archive11032/foo.tar: File located at "examples/central-counter/frontend/setupPath.sh" is a symbolic link to absolute path /home/robert/projects/gonimo-front/setupPath.sh

Inspecting the final error reveals a weirdness - that repository has a checked-in symlink to an absolute path on some person's computer!

Now, this is not a good idea, but it's still a valid Git repository and it used to work just fine. So I think this is a bug.

Stack version

2.1.1.1 x86_64 hpack-0.31.2

Method of installation

Nix.

michaelpj commented 5 years ago

Recently been hit by this again. It seems that it also breaks with:

mattaudesse commented 5 years ago

Thanks @michaelpj.

You might've already seen this related thread that explores a similar problem.

Having glanced through that discussion it sounds like we're deliberately leaning toward keeping such exceptions because it's easier to ensure deterministic package hashes that work reliably across platforms, but I've added the "discuss" label to this issue nonetheless - maybe there's some happy middle ground to be found.

michaelpj commented 5 years ago

I hadn't seen that thread.


Firstly: I've now observed this in cases with symlinks that aren't broken, too. I think that really should work. Here's some logs:

Unsupported tarball from https://github.com/input-output-hk/plutus/archive/dbc2646112a87c710a8dd6f80b63458ca37a06c4.tar.gz: Symbolic link dest not found from plutus-dbc2646112a87c710a8dd6f80b63458ca37a06c4/plutus-book/src/Game/Guess.lhs to ../../doc/game/guess.adoc, looking for plutus-dbc2646112a87c710a8dd6f80b63458ca37a06c4/plutus-book/src/../doc/game/guess.adoc.

It looks like only one layer of .. is being normalized away. I can open a separate issue for this, perhaps?


Secondly, I can see the reasons for doing what you're doing with broken symlinks, but it has some costs. Packages with broken symlinks will:

This means we're set up for repeated issues where things work everywhere else... and then suddenly don't in stack. And this will only be noticed by downstream users of a package, even if the package author uses stack they won't notice!

michaelpj commented 5 years ago

I made https://github.com/commercialhaskell/stack/issues/5004 for the relative symlink case which I think is just a bug. This one can continue to be about broken symlinks, which I think is a more ambiguous case (although I still think it should work).

mattaudesse commented 5 years ago

Well-spotted and thanks again for your reports @michaelpj.

I'm going to CC @qrilka @snoyberg and @dbaynard since I believe they're all much better informed on this subject than I am and probably in a better position to help.

snoyberg commented 5 years ago

I'm not particularly inclined to make any changes based on what I'm seeing in this issue. The initial claim:

Now, this is not a good idea, but it's still a valid Git repository and it used to work just fine. So I think this is a bug.

Sure, lots of things are valid Git repository or tarballs. There are many such cases we don't support. Simple example: every build tool requires a .cabal file. The standard for "correctly defined Haskell package" is significantly higher than "Git will allow you to create it."

michaelpj commented 5 years ago

Well, it's not just that it's a valid Git repository. You can also upload it to Hackage just fine or use it with source-repository-package. For most other tools broken symlinks that aren't used in the build are just irrelevant. So I think it's fair to say that Stack's criterion of "correctly defined Haskell package" is here is different to everybody else's.

I don't think this is a huge deal (whereas I do care about https://github.com/commercialhaskell/stack/issues/5004), but I do think this is going to bite people. And there's no downstream workaround short of forking the upstream repository.

dbaynard commented 5 years ago

I've used the export-ignore workaround in the .gitattributes file, but it's a bit of a pain.

It does seem problematic that stack will build the git repository unless it is included as a dependency.

On a more practical note, this issue is now with commercialhaskell/pantry, isn't it?

https://github.com/commercialhaskell/pantry/blob/6ddb4438c50bcccbca2f8c6caabc0520c9e4ab00/src/Pantry/Archive.hs#L365-L370

Could the symbolic link path checking be redundant, here? (That said, resolving non-tarball paths from tarballs seems like a recipe for a later breakage, with potential security implications, too).

phracek commented 2 years ago

Do you have any idea, where can be a problem with this symbolic link? https://github.com/sclorg/s2i-perl-container/blob/master/5.26/test/sample-test-app.json