Closed damccull closed 1 year ago
Would you be comfortable sending over the p4k file you're using so I could have a quick look? Obviously don't bother if it's got anything personal in it.
Having had a good look as well, there's not really a clear answer. It's definitely using the ZIP format so it should be possible, but it's not clear if the encryption is on the whole archive or just individual files. If it is just individual files, you shouldn't be getting that specific unexpected header error, so it could equally be a bug.
Howdy. The Data.p4k in question is from the Star Citizen game client. I'm uploading a copy to OneDrive so I can share a link, since it's about 77GB (huge!) and I wasn't sure GitHub could handle it. Obviously if you have SC you can just look at it directly, but I'll send the link as soon as the upload is done. I plan to remove it after you get a copy since I'm not sure about public distribution beyond CIG's client.
@Majored Here's the link to my problematic Data.p4k, https://1drv.ms/u/s!AidfnSaORZHWnP0hDue-xEeMGYB_wQ?e=d6Y01s
Again, no rush or expectations, but it would be cool for your lib to open this. Or at least for me to learn a bit about it.
All downloaded, thanks.
Will look into it in a few hours.
Some minimal forensics I've tried to do:
According to the cryengine docs on crypak, which I believe the p4k file is, this should just be a standard zip file with its contents stored as deflate, zstd, or STORED, and sometimes encrypted with the standard crypak key.
However, when I use several freely available zip programs to view or analyze the file, I get one of the following:
- with the .p4k extension, it opens and displays 1 folder and 5 files, all related to a starfarer crash
- with the .zip extension, it fails to open entirely

Examining with `7z t`:
- as a p4k file, it claims the type is zip but that it has extra data after the end of the archive
- with the .zip extension, the same command claims it is not an archive

The `zipdetails` command on linux returns the following with either extension:
0000000000 00FFFFFFFF ... PREFIX DATA
Unexpecded END at offset FFFFFFFF, value 99DF8764
Done
Looking at the file in HxD shows offset FFFFFFFF at around 5-10% of the total file.
It's using the ZIP64 spec extension, so that would explain the 0xFFFF....'s: when a value doesn't fit in a field of the end of central directory header, the spec sets that field to 0xFFFF or 0xFFFFFFFF and stores the real value in the ZIP64 end of central directory record instead. So the first step would be supporting ZIP64 in the first place for this crate, which doesn't seem too bad honestly.
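To make the 0xFFFF check concrete, here's a minimal sketch of reading the fixed part of an end of central directory record and testing for those sentinel values. The names `Eocd`, `parse_eocd` and `needs_zip64` are made up for illustration and are not this crate's API; the field offsets follow the record layout in the spec.

```rust
/// A few fixed-size fields of the end of central directory record.
/// Hypothetical struct for illustration only.
#[derive(Debug, PartialEq)]
struct Eocd {
    entries_total: u16,
    cd_size: u32,
    cd_offset: u32,
}

const EOCD_SIGNATURE: u32 = 0x0605_4b50;

/// Parses an EOCD record from a slice that starts at its signature.
/// Returns None if the slice is too short or the signature doesn't match.
fn parse_eocd(buf: &[u8]) -> Option<Eocd> {
    // The record is 22 bytes long before the optional comment.
    if buf.len() < 22 {
        return None;
    }
    // All multi-byte fields in a ZIP file are little-endian.
    let u16_at = |i: usize| u16::from_le_bytes([buf[i], buf[i + 1]]);
    let u32_at = |i: usize| u32::from_le_bytes([buf[i], buf[i + 1], buf[i + 2], buf[i + 3]]);
    if u32_at(0) != EOCD_SIGNATURE {
        return None;
    }
    Some(Eocd {
        entries_total: u16_at(10), // total number of central directory entries
        cd_size: u32_at(12),       // size of the central directory in bytes
        cd_offset: u32_at(16),     // offset of the start of the central directory
    })
}

/// True when any field holds the sentinel meaning "see the ZIP64 record".
fn needs_zip64(e: &Eocd) -> bool {
    e.entries_total == u16::MAX || e.cd_size == u32::MAX || e.cd_offset == u32::MAX
}
```

A reader that sees `needs_zip64` return true would then go looking for the ZIP64 end of central directory locator just before this record.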
The one thing I am confused about with that p4k file is why the local file headers are using 0x14034B50 as the signature and not 0x04034B50. I assume it's to signify the difference between a normal vs zip64 header, but I can't see any mention of it in the spec - so not sure if that's specific to p4k files or not.
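For reference, a rough sketch of how a reader could tell the two signatures apart. The `LocalHeaderKind` enum and `classify_local_header` function are hypothetical names, and treating 0x14034B50 as a p4k-specific variant is only an assumption based on what this one file contains, not something the spec documents.

```rust
/// Local file header signature from the spec ("PK\x03\x04" on disk).
const LOCAL_FILE_HEADER: u32 = 0x0403_4b50;
/// Signature observed in this Data.p4k ("PK\x03\x14" on disk).
/// Assumption: this is a p4k-specific variant, not part of the ZIP spec.
const P4K_LOCAL_FILE_HEADER: u32 = 0x1403_4b50;

#[derive(Debug, PartialEq)]
enum LocalHeaderKind {
    Standard,
    P4kVariant,
    Unknown(u32),
}

/// Classifies the first four bytes of what should be a local file header.
fn classify_local_header(bytes: [u8; 4]) -> LocalHeaderKind {
    // Signatures are stored little-endian, so 50 4B 03 04 reads as 0x04034B50.
    match u32::from_le_bytes(bytes) {
        LOCAL_FILE_HEADER => LocalHeaderKind::Standard,
        P4K_LOCAL_FILE_HEADER => LocalHeaderKind::P4kVariant,
        other => LocalHeaderKind::Unknown(other),
    }
}
```

A strict reader that only accepts the `Standard` case would reject such a file with an unexpected header error at the first entry.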
I don't suppose you'd mind explaining how you figured out it's zip64? :D I'm new to zip file internals and am just figuring it out.
I'm not sure about the signature, to be honest. And, please, don't let other work on your library suffer for my questions. I'm patient, and what I want to do with this is not a huge priority as some C# tools already exist to open this.
I do want to eventually open it in rust, but low priority. I suppose if I really want to learn about zip, I need to begin by reading the spec. I'll get on that.
Just from looking at the end of the file and noticing that there were two PK zip signatures when a normal ZIP would only have one (the end of central directory header): https://gyazo.com/8ff274844a672fcb368bfb0466ebad67
Searching the signature up (converted from little to big endianness), the first one relates to the zip64 end of central directory locator. https://github.com/Majored/rs-async-zip/blob/main/SPECIFICATION.md#4315
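If it helps, the same "look at the end of the file" check can be sketched in code. This assumes you've already read the last chunk of the archive into a byte slice; `find_last` and `looks_like_zip64` are hypothetical helpers, and a plain byte search like this can be fooled by matching bytes inside an archive comment, so treat it as a heuristic.

```rust
/// On-disk bytes of the end of central directory signature (0x06054b50).
const EOCD: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
/// On-disk bytes of the zip64 end of central directory locator signature (0x07064b50).
const ZIP64_EOCD_LOCATOR: [u8; 4] = [0x50, 0x4b, 0x06, 0x07];

/// Returns the index of the last occurrence of a 4-byte signature, if any.
fn find_last(haystack: &[u8], needle: &[u8; 4]) -> Option<usize> {
    haystack.windows(4).rposition(|w| w == &needle[..])
}

/// Heuristic: the tail looks like ZIP64 when a locator sits directly in
/// front of the end of central directory record. The locator is a fixed
/// 20 bytes long, so the two signatures should be exactly 20 bytes apart.
fn looks_like_zip64(tail: &[u8]) -> bool {
    match (find_last(tail, &ZIP64_EOCD_LOCATOR), find_last(tail, &EOCD)) {
        (Some(locator), Some(eocd)) => locator + 20 == eocd,
        _ => false,
    }
}
```

Reading the signatures as little-endian bytes is why the value in the hex editor appears reversed compared with the value listed in the spec.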
And you're all good regarding questions. I'm happy to help plus I definitely want to support ZIP64 down the line so doing a bit of digging now doesn't hurt.
Holy cow. Your markdown version is so much easier to read than the official pkware APPNOTE.txt file. I'mma use this as my learning reference. Thanks for all the info and help. If I can figure out the spec in my brain maybe I'll help with a PR or two. I'm kind of an amateur though so I'm not sure it'll be anything impressive :)
Just want to note that I've started working on implementing Zip64 reading and writing.
I'll go ahead and close this as a majority of the Zip64 implementation has been added. I assume there might be a few rough areas or improvements that can be made so it'll be best to open any separate issues for those.
Thanks again for your work on this @skairunner.
Thanks for working on this! Sorry it's taken me so long to see this and reply. I've been offline for a while with life stuff going on. I'll open up my project again and see if this gets me where I need to be, and then I'll actively appreciate the work you both did.
So, I realize this isn't in the realm of responsibility for a zip archive library, but I am trying to open a p4k file (from star citizen) using your lib and getting an unexpected header. My research indicates (even from cryengine's own developer docs) that a crypak/p4k file should just be a zip with some zstd or deflate compression and some files encrypted while others are not even compressed. From what I have been able to find, I should be able to just list the files with standard zip capability, and yet I am getting this unexpected header before I even try to extract.
Looking at some other open source code that successfully opens the file, one project in dotnet and another in python, those authors have some kind of encryption key they seem to open it with. Maybe this is happening because async_zip doesn't support encryption yet? I'm trying to find out what kind of encryption it is. Maybe ZipCrypto, but I'm not sure.
Obviously this isn't something I expect you to solve, but I wondered if you have any insight into why this might happen. If not, that is ok too. Thanks!