adamhathcock / sharpcompress

SharpCompress is a fully managed C# library to deal with many compression types and formats.
MIT License

compatibility with non-standard zip formats #863

Closed: Qeynos closed this issue 3 months ago

Qeynos commented 3 months ago

I have an old zip file that may not have been created following the standard specifications. When I use SharpCompress to decompress it, the backslash character is merged with the file name, causing the extracted files to have incorrect paths, such as "/home/test/storage\first\error.jpg". I need to use code like this to solve the problem. I wonder if there has been any consideration for compatibility with non-standard zip formats.

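    using System.IO;
    using System.Linq;
    using System.Text;
    using SharpCompress.Archives;
    using SharpCompress.Archives.Zip;
    using SharpCompress.Common;
    using SharpCompress.Readers;

    // On .NET Core / .NET 5+, code page 936 (GBK) is only available after
    // registering the provider from System.Text.Encoding.CodePages.
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    var encoding = Encoding.GetEncoding(936);

    var opts = new ReaderOptions
    {
        ArchiveEncoding = new ArchiveEncoding
        {
            // Decode only the requested slice of the buffer.
            CustomDecoder = (data, index, count) => encoding.GetString(data, index, count)
        }
    };

    using var archive = ZipArchive.Open(archiveFilePath, opts);
    foreach (var entry in archive.Entries.Where(entry => !entry.IsDirectory))
    {
        if (entry.Key is null) continue;
        // The archive stores '\' in entry names; normalize to '/' so the
        // directory components are recognized on Linux as well.
        string relativePath = entry.Key.Replace('\\', '/');
        string filePath = Path.Combine(outputDirectoryPath, relativePath);
        // CreateDirectory is a no-op when the directory already exists.
        string? directory = Path.GetDirectoryName(filePath);
        if (!string.IsNullOrEmpty(directory))
        {
            Directory.CreateDirectory(directory);
        }
        entry.WriteToFile(filePath);
    }

adamhathcock commented 3 months ago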

The slashes are just part of the key name, which is really OS independent. Path.DirectorySeparatorChar is OS specific and could be used to figure that out, but not everyone wants the keys to be converted automatically.

Qeynos commented 3 months ago

On Linux, 'storage\first\error.jpg' from a non-standard zip file is treated as a single file name with no directory part. Even if you use Path.Combine to combine entry.Key with the output path, it will not correct this error.
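
To make the behavior described above concrete, here is a minimal sketch that combines an entry key with an output directory and then asks the runtime how it splits the result. `SeparatorDemo` and `CombineAndSplit` are hypothetical names used only for illustration; they are not part of SharpCompress.

```csharp
using System;
using System.IO;

public static class SeparatorDemo
{
    // Combines root and key the way the extraction loop does, then reports
    // which part the current platform treats as directory and which as file name.
    public static (string Directory, string FileName) CombineAndSplit(string root, string key)
    {
        string combined = Path.Combine(root, key);
        string directory = Path.GetDirectoryName(combined) ?? string.Empty;
        string fileName = Path.GetFileName(combined);
        return (directory, fileName);
    }
}
```

On Linux, where only '/' is a directory separator, `SeparatorDemo.CombineAndSplit("/home/test", @"storage\first\error.jpg")` should report the whole key `storage\first\error.jpg` as the file name. On Windows, where both '\' and '/' count as separators, the file name comes back as `error.jpg`.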

adamhathcock commented 3 months ago

Like I said, the key is the same on all platforms, slashes included. A zip is a series of key/value pairs.

The slash being different across platforms is an implementation detail that you need to handle yourself, for example with a helper method that converts the separators.
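
One possible shape for such a helper is sketched below. It assumes that both '/' and '\' in a key are meant as directory separators, which is not true of every archive, so treat it as a starting point. `EntryKeyPaths` and `ToPlatformPath` are hypothetical names, not SharpCompress APIs.

```csharp
using System;
using System.IO;

public static class EntryKeyPaths
{
    // Splits a key on both separator styles and rejoins the segments with
    // the current platform's directory separator.
    public static string ToPlatformPath(string key)
    {
        string[] segments = key.Split(new[] { '/', '\\' }, StringSplitOptions.RemoveEmptyEntries);
        return Path.Combine(segments);
    }
}
```

In the extraction loop from the original post this would be used as `Path.Combine(outputDirectoryPath, EntryKeyPaths.ToPlatformPath(entry.Key))`. Keeping the conversion in caller code matches the maintainer's point that not every user wants keys rewritten automatically.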