golift / xtractr

Go Library for Queuing and Extracting Archives: Rar, Zip, 7zip, Gz, Tar, Tgz, Bz2, Tbz2
https://golift.io/discord
MIT License

[Feature request] Provide a "test archive" api #62

Open sagan opened 3 months ago

sagan commented 3 months ago

Hi

I find this library very handy and easy to use. Thank you for your work.

I think the current APIs are too simple and lack some notable features, like "test archive" and "list archive".

Can you provide a new "test archive" API that tests the integrity of the archive file, as well as password correctness for encrypted archives, without actually extracting data to disk?

davidnewhall commented 3 months ago

Hello and thank you for opening a feature request.

Unfortunately, the best way to test an archive is to extract it. I'm not familiar with any other methods. It would also be a lot of work to write "test archive" frameworks for the now-dozen+ archive formats this library supports. My suggestion for testing is to extract into a ram disk (for speed). If the extraction finishes, the archive is valid.

There's a reason you don't see any library or app that tests archives or passwords (as a feature), except maybe 7zip.

As for testing a password, the only two archive types that support it are 7zip and rar. As far as I know, in both cases you must read the entire archive to test the password. You can do this without writing to disk, so it may be faster than a full extraction, but if you're reading the whole archive, why not extract it?

I'm not sure there's a quick solution for the feature you want. These file types just don't do that very well.

You're welcome to throw some code together to give me a demonstration of what you're looking for. Perhaps we can implement it.

sagan commented 3 months ago

Thank you for replying.

I am OK with testing the archive by "extracting" it. However, I don't want to write any temporary files to disk. I think a RAM disk or something similar is not suitable because:

1. These solutions need FUSE, which is very heavy and not portable. Some Linux machines may not include the FUSE kernel module, and on Windows you have to install WinFsp separately to use FUSE.
2. You need a RAM disk (and memory) with more space than the archive file, which is not possible in some cases.
3. I haven't found any decent pure-Go RAM disk library.

I think the simplest way is to change the xtractr API to accept an empty "OutputDir". In that case, instead of extracting files to disk, it just reads each file in the archive, discards the data, and reports any error. For example:

func (x *XFile) unzip(zipFile *zip.File) (int64, error) {
    // ... omitted
    // Test mode: with no output directory, read the entry and discard the data.
    if x.OutputDir == "" {
        return io.Copy(io.Discard, zFile)
    }
    s, err := writeFile(wfile, zFile, x.FileMode, x.DirMode)
    // ... omitted
}