microsoft / RecursiveExtractor

RecursiveExtractor is a .NET Standard 2.0 archive extraction library and command-line tool that can process 7zip, ar, bzip2, deb, gzip, iso, rar, tar, vhd, vhdx, vmdk, wim, xzip, and zip archives, as well as any nested combination of the supported formats.
MIT License

Some ISOs not parsed from DiscUtils #9

Open PrzemyslawKlys opened 4 years ago

PrzemyslawKlys commented 4 years ago

I've been playing with this library for a few minutes and created a small PowerShell module using it.

$ExtractMe = "$Env:UserProfile\Downloads\DaRT70.iso"
$Extractor = [Microsoft.CST.OpenSource.RecursiveExtractor.Extractor]::new()
$Extractor.ExtractFile($ExtractMe) | Format-Table

This is the error: image

This is a bootable .iso.

gfs commented 4 years ago

Thanks for your report. That is unusual, since it looks like you are on Windows.

Can you possibly provide the target file you're using?

PrzemyslawKlys commented 4 years ago

Unfortunately, I can't. This is from Dart from MS, but it's a private build.

But it would seem .ISO files aren't working properly:

image

Tar files behave differently

image

And also I am not sure if FullPath is what I would expect - why would there be ; and why would it reverse slash for files inside .tar? I guess it sees it like that inside .tar but maybe it should normalize it for Windows since I'm on Windows?

gfs commented 4 years ago

This is from Dart from MS, but it's a private build.

You can send it to my Microsoft email/Teams if that's possible. I have a .iso unit test that is passing, so it's harder to nail down without a repro file. However, I can grab a couple of random bootable ISOs and try to repro with those.

why would there be ;

This is a bug that is fixed in main but the builds aren't being published currently. We transitioned to a new repo and are still working on getting the pipeline up here.

why would it reverse slash for files inside .tar?

This is something I'll need to address. But basically it's because most non-Windows tools naturally use / paths.
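
Until that is addressed, a caller could normalize separators on their own side. A minimal sketch, assuming the paths are plain strings; `ConvertTo-NativePath` is a hypothetical helper and not part of RecursiveExtractor:

```powershell
# Hypothetical helper (not part of the library): rewrites both '/' and '\'
# to the separator of the platform the script is running on.
function ConvertTo-NativePath {
    param([Parameter(Mandatory)] [string] $Path)

    # Cast to string so the (string, string) overload of Replace is used.
    [string] $sep = [System.IO.Path]::DirectorySeparatorChar
    $Path.Replace('/', $sep).Replace('\', $sep)
}

# Example: a path reported for a file inside a .tar archive.
ConvertTo-NativePath -Path 'archive.tar/folder/file.txt'
```

On Windows this yields backslash-separated paths; on other platforms it yields forward slashes.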

PrzemyslawKlys commented 4 years ago

Please give me your address - I'll send it over.

Another question - is the library simply meant to "show" the files within an archive, or is it supposed to extract them on demand as well?

gfs commented 4 years ago

It extracts the files. Each returned FileEntry object has a Content stream that holds the contents of the file.
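
A minimal sketch of writing those streams to disk, assuming each entry exposes `FullPath` and `Content` as described in this thread. `Save-EntryContent` and the destination folder are hypothetical names, and the helper flattens the archive layout by keeping only the file name:

```powershell
# Hypothetical helper: copies one entry's Content stream into $Destination.
# Assumes $Entry has a FullPath string and a readable Content stream.
function Save-EntryContent {
    param(
        [Parameter(Mandatory)] $Entry,
        [Parameter(Mandatory)] [string] $Destination
    )

    # Create the folder if needed and resolve it to an absolute path.
    $dir = (New-Item -ItemType Directory -Path $Destination -Force).FullName

    # Keep only the last path segment, whichever separator the archive used.
    $leaf = ($Entry.FullPath -split '[\\/]')[-1]
    $target = Join-Path $dir $leaf

    $out = [System.IO.File]::Create($target)
    try {
        # Rewind seekable streams so the whole file is copied.
        if ($Entry.Content.CanSeek) { $Entry.Content.Position = 0 }
        $Entry.Content.CopyTo($out)
    }
    finally {
        $out.Dispose()
    }

    # Return the path that was written.
    $target
}

# Possible usage with the snippet from the first comment:
# $Extractor.ExtractFile($ExtractMe) |
#     ForEach-Object { Save-EntryContent -Entry $_ -Destination "$Env:Temp\Extracted" }
```

Because only the file name is kept, entries with the same name in different folders would overwrite each other; preserving the directory structure would need extra handling.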

PrzemyslawKlys commented 4 years ago

Sent.

gfs commented 4 years ago

One of these mentioned files seems to be fixed in main.
The other two samples return no results when passed to DiscUtils (our ISO parsing dependency). That appears to be a bug in the DiscUtils library. Need to investigate finding a public .ISO that reproduces the 0 result issue.

gfs commented 4 years ago

The current issue is that some ISOs produce no enumerated files.

gfs commented 2 years ago

When looking through issues on the DiscUtils repo, I noticed this comment that some ISOs that don't parse correctly may be UDF formatted. DiscUtils also has a UDF reader, so it might be worth retrying ISOs as UDF.

https://github.com/DiscUtils/DiscUtils/issues/210

SteAmeR commented 1 year ago

I apologize for raising the matter again, but the UDF issue still persists so I tried to fix it. I hope it helps.

gfs commented 1 year ago

Thanks for your help @SteAmeR with #120, which added an extractor for UDF ISOs without a Joliet index.

@PrzemyslawKlys is it possible for you to retry the file you were having issues with on the latest version, to reach back out and provide the sample file again, or perhaps to explain how to generate a file of this type with suitable test contents?