adamhathcock / sharpcompress

SharpCompress is a fully managed C# library to deal with many compression types and formats.
MIT License

Filename Encoding issue with Tar #754

Open DisIsAbhi opened 1 year ago

DisIsAbhi commented 1 year ago

Hello, thanks for working on this repo. It is really useful. I downloaded the Tar file from your Test Archives.

image

If I use

    var archive = TarArchive.Open(inputFile);
    foreach (var entry in archive.Entries.Where(entry => !entry.IsDirectory))
    {
        entry.WriteToDirectory(outputFolder, new ExtractionOptions() { ExtractFullPath = true, Overwrite = true, PreserveFileTime = true });
    }

the output looks like this

image

I have tried ASCII, Unicode, and UTF-8 as the archive encoding, and none of them seems to work.

This raises the question: when opening an archive, how do we make sure that the right ArchiveEncoding is being used? When we open the tar file in the 7-Zip file manager, it displays the filenames correctly. So if there is anything I am missing in the code here, please let me know.

adamhathcock commented 1 year ago

Zip file encoding detection needs improving. Setting it manually seems to be a workaround but not great.
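For reference, a minimal sketch of what setting the encoding manually might look like for the Tar case above. It assumes the `ArchiveEncoding.Default` property on `ReaderOptions` and a hypothetical `codePage` parameter; the class and method names are made up for illustration. Which code page (if any) matches the test archive is not established in this thread, so treat the value you pass as a guess to experiment with, not a known fix.

```csharp
using System.Linq;
using System.Text;
using SharpCompress.Archives;
using SharpCompress.Archives.Tar;
using SharpCompress.Common;
using SharpCompress.Readers;

public static class TarEncodingSketch
{
    // codePage is an assumption: it should be the legacy code page the
    // archive's filenames were written with, which has to be known or guessed.
    public static void Extract(string inputFile, string outputFolder, int codePage)
    {
        // On .NET Core / .NET 5+, legacy code pages are only available
        // after this provider has been registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        var options = new ReaderOptions
        {
            // Used to decode entry names when reading the tar headers.
            ArchiveEncoding = new ArchiveEncoding { Default = Encoding.GetEncoding(codePage) }
        };

        using var archive = TarArchive.Open(inputFile, options);
        foreach (var entry in archive.Entries.Where(e => !e.IsDirectory))
        {
            entry.WriteToDirectory(outputFolder, new ExtractionOptions
            {
                ExtractFullPath = true,
                Overwrite = true,
                PreserveFileTime = true
            });
        }
    }
}
```

A call such as `TarEncodingSketch.Extract(inputFile, outputFolder, 866)` would try one particular code page; the number 866 is purely illustrative. If no single code page produces readable names, the archive may mix encodings or store names in a way that a per-archive setting cannot handle.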

DisIsAbhi commented 1 year ago

Thanks. Surprisingly, I have also tried the .NET 7 implementation of TarReader (thanks for contributing to that as well; I have seen your comments):

    using FileStream archiveStream = File.Open(inputFile, FileMode.Open, FileAccess.Read);

    using (TarReader reader = new(archiveStream, leaveOpen: true))
    {
        TarEntry? entry;
        while ((entry = reader.GetNextEntry()) != null)
        {
            if (entry.EntryType != TarEntryType.Directory)
            {
                var fname = Path.Join(outputFolder, entry.Name);
                var dname = Path.GetDirectoryName(fname);
                // CreateDirectory is a no-op when the directory already exists.
                if (!string.IsNullOrEmpty(dname)) Directory.CreateDirectory(dname);
                Log.Information($"Entry name: {entry.Name}, entry type: {entry.EntryType}");
                entry.ExtractToFile(destinationFileName: fname, overwrite: false);
            }
        }
    }

This also seems to give me the same output as SharpCompress. The same set of files in the other compression formats in your archive test folder seems to work.

This is what I get from the 7Zip.BZip2.7z archive you have provided. Thanks.

image

DisIsAbhi commented 1 year ago

Also, your repo seems to have a filenamedecoder for rar files. Do you think it may be useful here?