sk-zk / Extractor

cross-platform .scs extractor
GNU General Public License v2.0

strange format scs file #2

Closed CrVault closed 9 months ago

CrVault commented 9 months ago

Hello, I would like to start by thanking you for creating the tool. I have an .scs file with strange hexadecimal content that is causing problems with extraction, especially in raw mode: https://www.mediafire.com/file/wyxh8vqiwpjdvqu/Scania_file.scs/file

Could you take a look? Thank you very much in advance.

sk-zk commented 9 months ago

There are four entries in the archive with their offset set to 0, causing a crash when trying to decompress the bytes at that location. This is now fixed.

That's not the only issue with the archive though; it claims to contain lots of entries with the same offset:

0: 4x
8596: 337x
8706: 337x
358181: 45x
48844304: 15x

Only the first entry for any of these offsets (except for 0, obviously) describes the actual file, and the rest are junk that can be ignored.

With these two fixes, the archive can now be extracted in raw mode.
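The two fixes described above can be sketched roughly as follows. This is a minimal Python illustration, not the extractor's actual code: the `Entry` class, its field names, and both helper functions are hypothetical, and it assumes "first entry" means first in the order the entry table is read.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    # Hypothetical stand-in for an archive entry; the real fields may differ.
    path_hash: int
    offset: int


def offset_histogram(entries):
    """Count how many entries claim each offset (as in the list above)."""
    return Counter(e.offset for e in entries)


def filter_junk(entries):
    """Drop entries with offset 0 and all but the first entry per offset."""
    seen = set()
    kept = []
    for e in entries:
        if e.offset == 0:
            continue  # invalid offset: nothing to decompress there
        if e.offset in seen:
            continue  # later duplicates of an offset are treated as junk
        seen.add(e.offset)
        kept.append(e)
    return kept
```

Running `offset_histogram` over the entry table is one way to spot an archive padded like this one; `filter_junk` then leaves at most one entry per real offset.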

CrVault commented 9 months ago

I'm seeing an inconsistency when extracting the files: some entries contain additional characters, which I believe could be specific to this type of file. [screenshot]

From what I've seen, this also modifies internal files in the scs file and causes errors when loading the mod. I tried replacing all the entries using a hex editor but it doesn't seem to work for all file types.

Rendered as text it reads â¡; in hex: E2 81 A1

edit: Apparently some files are not found by the extractor either (I did some tests on the mod image file, it was not extracted with the name described in the manifest file)

sk-zk commented 9 months ago

I didn't get what you were talking about at first, but I know what's going on here now.

Let's take one of the paths from your screenshot, minus the very suspicious whitespace: /vehicle/truck/ntg_lgmods/anim_ext/win_open_ori.pmg

If you look up the hash of this string in the archive, you will find one of the junk entries with an invalid offset. However, as you've noticed, the .sii files contain some extra bytes in their paths. What are those? Well, let's open the .sii file where the associated .pmd of the file above is referenced, and paste that string into a Unicode analyzer:

As you can see, the extra bytes are the Unicode character U+2061 (FUNCTION APPLICATION, encoded in UTF-8 as E2 81 A1), which is invisible in most fonts. This is not a bug: the directories are in fact called vehicle\u2061 etc. as an obfuscation technique. If you call the extractor with this path, as UTF-8 (which it supports now - I thought paths had to be ASCII, whoops), it will extract the actual file instead of the decoy. This should allow you to unpack the mod.
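If no Unicode analyzer is at hand, the same check can be done with a few lines of Python. This is only a sketch: `find_invisible` is a hypothetical helper, and the sample path in the usage note is illustrative, not copied from the archive. It relies on U+2061 belonging to the Unicode "Cf" (format) category.

```python
import unicodedata

# The three bytes seen in the hex editor decode to a single code point.
raw = bytes.fromhex("E2 81 A1")
ch = raw.decode("utf-8")
print(f"U+{ord(ch):04X}", unicodedata.name(ch))


def find_invisible(path):
    """List (index, code point, name) for each format-category character."""
    return [
        (i, f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for i, c in enumerate(path)
        if unicodedata.category(c) == "Cf"
    ]
```

For example, `find_invisible("/vehicle\u2061/truck")` reports one hit at index 8, directly after `vehicle`. A path copied out of a .sii file can be checked the same way before passing it to the extractor.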


"I did some tests on the mod image file, it was not extracted with the name described in the manifest file"

I can't replicate this. The only image referenced in manifest.sii is new.jpg, which extracts correctly.