mw99 / DataCompression

Swift libcompression wrapper as an extension for the Data type (GZIP, ZLIB, LZFSE, LZMA, LZ4, deflate, RFC-1950, RFC-1951, RFC-1952)
Apache License 2.0

Unable to unzip large files zip on MacOS command line. #2

Closed garygriswold closed 7 years ago

garygriswold commented 7 years ago

Your DataCompression.swift is exactly the kind of utility that I need to unzip downloaded files in a mobile app. The files are SQLite database files that are about 6 MB compressed and 22 MB uncompressed. I have tested that each of your compression algorithms will compress and decompress this data correctly. However, the problem appears when I compress a file with the macOS command-line zip tool:

$ zip myFile.zip myFile

None of your decompression algorithms is able to decompress the 'myFile.zip'.

.inflate() and .ZLIB return the error 'Invalid BTYPE 3 block'. All of the others return no error. In every case the resulting Data is nil.

Are there any settings that can be tweaked to solve that problem? Thanks so much for any help you can provide. I would really like to be able to use your code.

Below is the sample code that I have used to test this:

private func zip_unzip(sourceFile: String, targetFile: String, doZip: Bool) -> Void {
    do {
        let sourceUrl = URL(fileURLWithPath: NSHomeDirectory() + sourceFile)
        print("source URL \(sourceUrl)")
        let source = try Data(contentsOf: sourceUrl)
        print("source Data \(source)")
        let target: Data? = (doZip) ? source.compress(withAlgorithm: .ZLIB) : source.decompress(withAlgorithm: .ZLIB)
        print("target Data \(String(describing: target))")
        let targetUrl = URL(fileURLWithPath: NSHomeDirectory() + targetFile)
        print("target url \(targetUrl)")
        try target?.write(to: targetUrl, options: Data.WritingOptions.atomic)
    } catch {
        print("caught exception did not complete zip/unzip")
    }
}

mw99 commented 7 years ago

Hello,

Thank you for your message. It is always nice to hear from someone who uses your software.

The problem is that the macOS command-line tool 'zip' produces a file in the PKZIP format, commonly known as the classic ".zip file". That is actually quite a complex format for bundling many files into one archive while also compressing them. Internally, PKZIP uses the deflate algorithm to compress the files, so it is quite easy to get confused here.

https://en.wikipedia.org/wiki/PKZIP
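
To illustrate why a .zip file is not a bare deflate stream, here is a minimal sketch that reads the first local file header of a PKZIP archive. The type `ZipLocalHeader` and the functions `readLE16`, `readLE32` and `parseFirstLocalHeader` are hypothetical names made up for this example and are not part of DataCompression. The field offsets follow the published ZIP local file header layout. The sketch assumes the archive stores the sizes in the local header, which is not the case when the archiver uses a trailing data descriptor.

```swift
import Foundation

/// Selected fields of the first local file header in a PKZIP archive.
/// Hypothetical helper type, not part of DataCompression.
struct ZipLocalHeader {
    let compressionMethod: UInt16  // 0 = stored, 8 = deflate
    let compressedSize: Int        // 0 if the archive uses a data descriptor
    let payloadOffset: Int         // index of the first payload byte
}

/// Reads a little-endian 16-bit value at the given byte offset.
func readLE16(_ bytes: [UInt8], _ offset: Int) -> UInt16 {
    return UInt16(bytes[offset]) | (UInt16(bytes[offset + 1]) << 8)
}

/// Reads a little-endian 32-bit value at the given byte offset.
func readLE32(_ bytes: [UInt8], _ offset: Int) -> UInt32 {
    return UInt32(readLE16(bytes, offset)) | (UInt32(readLE16(bytes, offset + 2)) << 16)
}

/// Returns the parsed header, or nil if the data does not start with
/// the local file header signature "PK\u{03}\u{04}".
func parseFirstLocalHeader(_ data: Data) -> ZipLocalHeader? {
    // Copy the fixed 30-byte header into an array so indices start at 0.
    let bytes = [UInt8](data.prefix(30))
    guard bytes.count == 30,
          bytes[0] == 0x50, bytes[1] == 0x4B,
          bytes[2] == 0x03, bytes[3] == 0x04 else {
        return nil
    }
    let nameLength = Int(readLE16(bytes, 26))
    let extraLength = Int(readLE16(bytes, 28))
    return ZipLocalHeader(
        compressionMethod: readLE16(bytes, 8),
        compressedSize: Int(readLE32(bytes, 18)),
        payloadOffset: 30 + nameLength + extraLength
    )
}
```

If the method is 8 and the compressed size is non-zero, the bytes from `payloadOffset` up to `payloadOffset + compressedSize` should be a raw deflate stream, which is presumably the input that .inflate() expects. Feeding the whole archive to a decompressor instead makes it interpret the header bytes as compressed data, which would explain errors such as the one reported above.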

So you want to create files on the command line that DataCompression can decompress? No problem. One way would be:

$ wget http://www.zlib.net/zpipe.c
$ gcc zpipe.c -o zpipe -lz
$ cat your_sqlite_file | ./zpipe > sqlite_db.deflated

You can then read 'sqlite_db.deflated' into a Data type and call .unzip() on it to decompress it.
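
As a quick sanity check before decompressing, the following sketch tests whether a buffer starts with a plausible zlib (RFC-1950) header, which is the format zpipe writes. The function name `looksLikeZlibStream` is a hypothetical helper for this example and not part of DataCompression. It only inspects the two header bytes, so a positive result does not prove the rest of the stream is valid.

```swift
import Foundation

/// Returns true if the data begins with a plausible zlib (RFC-1950) header.
/// Hypothetical helper, not part of DataCompression.
func looksLikeZlibStream(_ data: Data) -> Bool {
    let bytes = [UInt8](data.prefix(2))
    guard bytes.count == 2 else { return false }
    let cmf = bytes[0]  // compression method and window size
    let flg = bytes[1]  // flags, including the header check bits
    // The low nibble of CMF must be 8 (deflate), and the two bytes read
    // as a big-endian 16-bit number must be a multiple of 31.
    let isDeflate = (cmf & 0x0F) == 8
    let checkPasses = (UInt16(cmf) << 8 | UInt16(flg)) % 31 == 0
    return isDeflate && checkPasses
}
```

A file produced by the macOS 'zip' tool starts with the bytes "PK" and fails this check, while typical zpipe output starts with 0x78 and passes it.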

Other possibilities to create deflated files can be found here: https://stackoverflow.com/questions/3178566/deflate-command-line-tool

Alternatively, you could install the lzma command-line tool with

$ brew install lzma

and use that algorithm.

Or you could try to install the LZFSE command-line tool from https://github.com/lzfse/lzfse and use that algorithm.

Good luck.

Keep in mind that DataCompression needs to load the whole file into memory before decompressing it. That means that at some point both the compressed and the decompressed data have to be in memory at the same time. So if your files exceed, say, 50 MB and you get memory warnings, you should look for a streaming compression solution.
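
One way to act on this is to check the file size before loading it. The sketch below is only an illustration: the function name `fitsInMemoryBudget` is hypothetical, and the 50 MB default is simply the rough figure mentioned above, not a limit enforced by DataCompression.

```swift
import Foundation

/// Returns true if the file at `path` exists and is no larger than `limitBytes`.
/// Hypothetical helper; the default limit is an assumed rule of thumb.
func fitsInMemoryBudget(path: String, limitBytes: UInt64 = 50 * 1024 * 1024) -> Bool {
    // Ask the file system for the size instead of reading the file.
    guard let attributes = try? FileManager.default.attributesOfItem(atPath: path),
          let size = (attributes[.size] as? NSNumber)?.uint64Value else {
        return false  // missing or unreadable file
    }
    return size <= limitBytes
}
```

Note that the size on disk is only that of the compressed input. The decompressed data can be several times larger (about 6 MB versus 22 MB for the files in this thread), so the budget should leave room for both.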

Best regards,

Markus Wanke

garygriswold commented 7 years ago

Markus,

Thanks for these very helpful remarks.