Open hanjoosten opened 5 years ago
This could be Windows-specific. I just tried using zip-archive to extract this zip on my (macos) machine, and had no difficulties. Unfortunately, it isn't easy for me to test on Windows. Maybe you can do some poking around and see why this exception is being raised? The relevant code is just
let path = case [d | OptDestination d <- opts] of
(x:_) -> x </> eRelativePath entry
_ -> eRelativePath entry
absPath <- makeAbsolute path
curDir <- getCurrentDirectory
let isUnsafePath = ".." `isInfixOf` absPath ||
not (curDir `isPrefixOf` absPath)
when isUnsafePath $ E.throwIO $ UnsafePath path
So you could do some tracing to see what absPath
and curDir
are in your case.
Ah. This is where the problem is. The problem is in the use of curDir.
In my usecase, I have a destination that has no relation with curDir. So I get isUnsafePath == True for all my entries.
I believe the test to see in an entry has an unsafe path should be done on the eRelativePath of the entry before the destination is prepended to it. If that path is an absolute path OR that path contains ".."
, then it could be mallicious. I do not see the relation with curDir.
Ah yes, curDir is definitely wrong here! Sorry about that.
The test you suggest is the one I had originally suggested in #50. @blender suggested instead
Make the path absolute (resolving symlinks as well) and check that it does not leave the root directory of the archive.
and this was my attempt to do that.
Questions:
@blender I'd be interested in your view on this too, since you raised the original issue.
Well, the good news is this isn't windows specific. Thanks for looking into this!
@jgm my memory is a little rusty but I think the path to check is this one: https://github.com/jgm/zip-archive/blob/master/src/Codec/Archive/Zip.hs#L145
This path (once resolved) must not leave the root of the archive.
If that is sanitized then there shouldn't be any reason to worry about symlinks that anyways cannot leave the root: https://github.com/jgm/zip-archive/blob/master/src/Codec/Archive/Zip.hs#L404
All paths should be resolved relative to the root for the archive when recorded and when extracted.
Even if one zips /usr/local/bin
this should be unzipped to <output-destination>/usr/local/bin
.
So:
eRelativePath
is sanitizednormalizePath
So how about if we just check
not (".." `isInfixOf` eRelativePath entry || isAbsolute (eRelativePath entry))
@jgm I think that would work but unfortunately Hello..ciao
is a valid directory name so the first check would give a false positive.
how about splitting eRelativePath entry
on /
and checking that the list doens't contain ..
? it's till naive as more elaborate paths might actually not leave the root of the archive once properly resolved.
I have looked at canonicalizePath
but I am not sure about this bit:
Canonicalization is performed on the longest prefix of the path that points to an existing file or directory. The remaining portion of the path that does not point to an existing file or directory will still be normalized, but case canonicalization and indirection removal are skipped as they are impossible to do on a nonexistent path.
You could use splitDirectories to split the paths in the archives. Then you should be fine a name like Hello..ciao
.
If you need someone to test your solution on a windows machine, just let me know.
-- Copied from MissingH
subIndex :: Eq a => [a] -> [a] -> Maybe Int
subIndex substr str = findIndex (isPrefixOf substr) (tails str)
-- Actual resolve
resolve path = case subIndex [".."] path of
Just 0 -> undefined -- escapes!
Just index -> resolve $ (take (index-1) path) ++ (drop (index+1) path)
Nothing -> path
Prelude Data.List> resolve ["1", "2", "3", ".."]
["1","2"]
Prelude Data.List> resolve ["1", "2", "3", "..", ".."]
["1"]
Prelude Data.List> resolve ["1", "2", "3", "..", "..", ".."]
[]
Prelude Data.List> resolve ["1", "2", "3", "..", "..", "..", ".."]
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
undefined, called at <interactive>:205:31 in interactive:Ghci34
Prelude Data.List>
This however doens't cover stuff like ["1", "2", "3", "..", "..", "..", "..", "1"]
which doesn't actually escape
I'd argue that ["1", "2", "3", "..", "..", "..", "..", "1"]
does actually escape. It seems valid to me to throw an error here.
Yes, I think splitDirectories
and check for a ..
component is the way to go here.
Have a look at the change I just pushed.
An alternative approach would be to always append a destination path to the relative path, and if no destination path is specified, use the working directory. This should have the same effect in practice.
Great! Thanks for the fix.
@jgm I think it should work until someone complaints that A/../A
is a valid path in an archive :)
Hm, yes. Well, I wouldn't object to adding some logic to simplify A/../A
to A
etc.
Seems like something simple like this could work:
simplifyPath :: FilePath -> IO FilePath
simplifyPath fp = maybe (throwIO (UnsafePath fp)) return (resolve (splitDirectories fp))
resolve :: [String] -> Maybe [String]
resolve [] = Just []
resolve ("..":_) = Nothing
resolve (".":xs) = resolve xs
resolve (x:"..":xs) = resolve xs
resolve (x:xs) = (x:) <$> resolve xs
This wouldn't allow
"foo" </> "bar" </> ".." </> ".." </> "baz"
while it should.
Maybe not very elegant, but something like this would do the job:
simplifyPath :: FilePath -> IO FilePath
simplifyPath fp = maybe (throwIO (UnsafePath fp)) $ return (resolve (splitDirectories fp))
resolve :: [String] -> Maybe [String]
resolve str = resolve' (elemIndices ".." str') str'
where str' = filter (== ".") str
resolve' [] rest = Just rest
resolve' (0:_) rest = Nothing
resolve' (n:ns) rest = (foo <>) <$> resolve' (elemIndices ".." rest') rest'
where (foo,bar) = splitAt (n-1) rest
rest' = drop 2 bar
I like this one because it's a bit more obvious how it works:
import System.FilePath
import Control.Monad
import Codec.Archive.Zip
import Control.Exception
simplifyPath :: FilePath -> IO FilePath
simplifyPath fp =
maybe (throwIO (UnsafePath fp)) (return . joinPath)
(resolve . splitDirectories $ fp)
resolve :: Monad m => [String] -> m [String]
resolve =
fmap reverse . foldl go (return [])
where
go acc x = do
xs <- acc
case x of
"." -> return xs
".." -> case xs of
[] -> fail "outside of root path"
(_:ys) -> return ys
_ -> return (x:xs)
I've pushed a change, further testing welcome, and we really need to add some automated tests to ensure that the exception is raised when it should be.
I use zip-archive as a library in ampersand. We need to extract the file https://github.com/AmpersandTarski/Prototype/archive/v1.1.1.zip programatticaly. Since we upgraded to zip-archive v0.4, we get an error message:
I think it is totally valid to extract a zipfile that contains only files and directories that are in scope, such as
.bowerrc
I expected that the extraction of the archive would go without any problem. Has something gone wrong with the implementation of Issue #50 ?