snoyberg / tar-conduit

Conduit based tar extraction mechanism
MIT License
8 stars 9 forks source link

Hard links should be handled #19

Closed gaborh-da closed 5 years ago

gaborh-da commented 6 years ago

Hard links cause immediate termination of the program using the library.

  1. I do not think error should be used (maybe http://hackage.haskell.org/package/base-4.11.1.0/docs/System-IO-Error.html#t:IOError would be a more appropriate choice, it can be handled, at least, by the user)
  2. Hard links could be handled somehow
    • like symlinks maybe?
    • a minimum would be to have the file the link is pointing at in FTHardLink constructor (like for symlinks)
lehins commented 6 years ago

@gaborh-da This issue is twofold:

  1. Addition of support for hard links would be nice and I assume it work similarly to symbolic links. But I am not sure how soon I'd be able to get to it. So a PR would be welcome if you'd like to take a crack at it.
  2. Trying to restore file types from tar archive that are currently unsupported will cause an exception. Avoiding that issue is very simple and I added a helper function extractTarballLenient that ignores any possible errors in d633ed9492f5552ad8cce836fa3311bfeee320e4 Let me know how it works out for you and I'll make a release. In case you want some custom error handling it's not that hard, all of the building blocks are already available.
twpayne-da commented 6 years ago

In its current state, the tar-conduit library aborts the entire process if it encounters a tar file with hard links. As tar files are usually user-supplied input, this means any application or process using the tar-conduit library is trivially vulnerable to a denial-of-service attack.

extractTarballLenient silently ignores errors, which is not suitable for production code. Errors should be collected and properly fed back to the caller.

lehins commented 6 years ago

@twpayne-da tar-conduit throwing an exception is not synonymous with "aborting the entire process", nothing prevents you from wrapping a call to any of the provided functions with tryAny or catchAny, which would yield you the reason for failure as well as prevent your whole process from dying. Moreover because "tar files are usually user-supplied input" it is absolutely natural to expect an exception. Therefore this statement is absurd:

tar-conduit trivially vulnerable to a denial-of-service attack

At the same time your suggestion of collecting errors in extractTarballLenient seems fair enough, so I added a slightly modified version that just that. It is currently available in a separate branch: extractTaballLenient. Let me know what you think of that approach.

lehins commented 5 years ago

Restoring Hard Links has been implemented in #20 and lenient restoring was added in #23

Features in regards of hard links that are still missing: