antelle / node-stream-zip

node.js library for fast reading of large ZIPs

Cannot Extract Folders From Archive #62

Closed Unknown025 closed 4 years ago

Unknown025 commented 4 years ago

Hello there!

I've recently started using this library to download and extract zip archives. However, I've run into a problem: I get an "ENOENT: no such file or directory, mkdir [path]" error when extracting a specific archive. I can provide the specific zip file if needed, but it looks similar to this:

mcheli.zip

My current code looks like so: Gist. Anything I'm doing wrong?

antelle commented 4 years ago

Hi! Do you have a stack trace of the error?

Unknown025 commented 4 years ago

Yes, it throws on line 14.

{ "errno": -4058, "code": "ENOENT", "syscall": "mkdir", "path": "mcheli\\assets\\mcheli" }

The top-level folder that I'm trying to extract to definitely exists, but the folder structure from inside the archive does not.

antelle commented 4 years ago

I mean, inside, where does it throw? It should be there in the stack trace.

Unknown025 commented 4 years ago

Ah, okay. It's line 563 in node_stream_zip.js, in the callback from createDirectories(outPath, dirs, function (err))...

antelle commented 4 years ago

Strange. If you can make a minimal example of such an archive, along with a script that fails, it would help.

Unknown025 commented 4 years ago

Interesting, I've been trying to recreate an archive that would trigger the same exception, but even the same archive created through 7-Zip doesn't cause the problem. Normally, I create the archive using C#'s ZipFile.CreateFromDirectory() method, so I wonder if that has something to do with it. By default, it seems to use UTF-8, but using ASCII encoding seems to mess up some of the filenames, and yields the same error. I'll do some more digging to see if I can find out anything else.

antelle commented 4 years ago

What do you see if you do

zip -sf your_archive

or

unzip -l your_archive

on that zip?

Unknown025 commented 4 years ago

zip -sf: Total 3025 entries (435478603 bytes)
unzip -l: 435478603 3025 files

antelle commented 4 years ago

Well I meant file names, especially directories that it contains. Probably it's too much if it's >3k entries... I just tried to create an archive like this:

unzip -l deep.zip      
Archive:  deep.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  10-18-2020 10:36   dir/inside/deep/
        7  10-18-2020 10:36   dir/inside/deep/file.txt
---------                     -------
        7                     2 files

so that it doesn't contain entries for dir/ and dir/inside/ as it usually would, and it also works for me. I was thinking it might be something like missing parent directories, but it must be something else.
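For reference, the failure mode being ruled out here can be reproduced with Node's built-in `fs` module alone. This is a minimal sketch, not the library's own code: the temporary directory and the `dir/inside/deep` layout are only illustrative, borrowed from the listing above. A plain `mkdir` on a nested path fails with ENOENT when the parents are missing, while `recursive: true` creates them.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Work in a fresh temporary directory so the example leaves the real tree alone.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'zip-mkdir-'));
const nested = path.join(root, 'dir', 'inside', 'deep');

// Without `recursive`, mkdir fails because `dir/inside` does not exist yet.
let plainMkdirCode = null;
try {
    fs.mkdirSync(nested);
} catch (err) {
    plainMkdirCode = err.code;
}

// With `recursive: true`, every missing parent directory is created first.
fs.mkdirSync(nested, { recursive: true });
const nestedExists = fs.statSync(nested).isDirectory();

console.log(plainMkdirCode, nestedExists);

// Clean up the temporary tree.
fs.rmSync(root, { recursive: true, force: true });
```

The same error code and syscall appear in the report above, which is why missing parents were a reasonable first guess.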

Unknown025 commented 4 years ago

My bad, here is the full output.

antelle commented 4 years ago

Thanks, I know why this happens: it's because of the backslashes. I wonder whether it's correct to write backslashes like this into a zip archive. If I create a zip like this, it's extracted as one file with backslashes in its name on macOS, and I assume on Linux too. Do other utilities work well with such zips on Windows?
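The platform difference can be seen with Node's built-in `path` module, which exposes both rule sets on any OS. This is only an illustration of how the separators are interpreted, not the library's extraction logic; the directory part comes from the error above and `file.txt` is a made-up file name.

```javascript
const path = require('path');

// Entry name with backslash separators, as in the failing archive.
const entryName = 'mcheli\\assets\\mcheli\\file.txt';

// POSIX rules: a backslash is an ordinary filename character,
// so the whole string is treated as a single file name.
const posixDir = path.posix.dirname(entryName);
const posixBase = path.posix.basename(entryName);

// Windows rules: a backslash is a path separator,
// so the directory part is split off from the file name.
const win32Dir = path.win32.dirname(entryName);
const win32Base = path.win32.basename(entryName);

console.log({ posixDir, posixBase, win32Dir, win32Base });
```

Under POSIX rules the directory part is just `.`, which matches the "one file with a backslash in its name" behaviour on macOS.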

antelle commented 4 years ago

See also: https://docs.microsoft.com/en-us/dotnet/framework/migration-guide/mitigation-ziparchiveentry-fullname-path-separator

Unknown025 commented 4 years ago

Looks like that's what I get for coasting on old .NET versions. Thanks, that seems to have fixed the problem entirely. Other utilities must be used to Windows using backslashes for everything, since 7-Zip and Windows Explorer both worked fine. Targeting .NET 4.6.2 created an archive that extracts properly as far as I can see. I would probably never have figured it out myself, so once again, thanks for the help, I appreciate it.

antelle commented 4 years ago

You're welcome! It's interesting that this is the first time it has been reported, so I assume all other utilities have been writing forward slashes for a while, apart from old .NET builds. It would be quite easy to add a flag that replaces the backslashes, but I assume we no longer need it.
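For anyone who cannot regenerate their archives, here is a rough sketch of what such a flag could look like in calling code. Everything in it is hypothetical: `resolveEntryPath` and the `normalizeBackslashes` option are names invented for this example and are not part of node-stream-zip.

```javascript
const path = require('path');

// Hypothetical helper, NOT part of node-stream-zip: maps a zip entry name to an
// output path, optionally treating backslashes as directory separators.
function resolveEntryPath(outDir, entryName, options = {}) {
    const name = options.normalizeBackslashes
        ? entryName.replace(/\\/g, '/')
        : entryName;

    // Split on forward slashes only; drop empty and '.' segments.
    const segments = name.split('/').filter((s) => s !== '' && s !== '.');

    // Once backslashes count as separators, '..\\x' becomes a parent reference,
    // so reject anything that could climb out of the output directory.
    if (segments.includes('..')) {
        throw new Error(`Refusing entry that escapes the output directory: ${entryName}`);
    }
    return path.join(outDir, ...segments);
}

const fixed = resolveEntryPath('out', 'mcheli\\assets\\mcheli\\file.txt', {
    normalizeBackslashes: true,
});
console.log(fixed);
```

Keeping the behaviour opt-in would leave archives that legitimately contain a backslash in a file name untouched on macOS and Linux.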