Open flying-sheep opened 10 years ago
I imagine getting the filepath is related to the first (or last) 32 bits of each file record, but reading them in any way appears to just give garbage data.
I think a good first step is to attempt to "decrypt" the whole file using one of the record's keys (or perhaps there's a key in the file header somewhere?) and write it to disk, and see what kind of strings we can get out of it.
Decompiling the file path hash function right now.
Try to comment on the code you make! I think this is awesome to learn about.
Here is the code. I'm going to try hashing all the strings in Isaac's binary and see what comes up.
/* Original Asm (Base address 0x00FA0000)
0111B8B0 /$ 56 PUSH ESI
0111B8B1 |. 8BF0 MOV ESI, EAX
0111B8B3 |. 8A0E MOV CL, [ESI]
0111B8B5 |. B8 05150000 MOV EAX, 1505
0111B8BA |. 84C9 TEST CL, CL
0111B8BC |. 74 27 JE SHORT 0111B8E5
0111B8BE |. 8BFF MOV EDI, EDI
0111B8C0 |> 8D51 BF /LEA EDX, [ECX-41]
0111B8C3 |. 46 |INC ESI
0111B8C4 |. 80FA 19 |CMP DL, 19
0111B8C7 |. 77 03 |JA SHORT 0111B8CC
0111B8C9 |. 80C1 20 |ADD CL, 20
0111B8CC |> 80F9 5C |CMP CL, 5C
0111B8CF |. 75 02 |JNZ SHORT 0111B8D3
0111B8D1 |. B1 2F |MOV CL, 2F
0111B8D3 |> 8BD0 |MOV EDX, EAX
0111B8D5 |. C1E2 05 |SHL EDX, 5
0111B8D8 |. 03D0 |ADD EDX, EAX
0111B8DA |. 0FB6C1 |MOVZX EAX, CL
0111B8DD |. 8A0E |MOV CL, [ESI]
0111B8DF |. 03C2 |ADD EAX, EDX
0111B8E1 |. 84C9 |TEST CL, CL
0111B8E3 |.^ 75 DB \JNZ SHORT 0111B8C0
0111B8E5 |> 5E POP ESI ; 00AE74F8
0111B8E6 \. C3 RETN
*/
// Isaac ANDs the result with 0x7FF and uses that number as an array index to look up the file's data.
static uint Hash1(string str)
{
    uint ret = 0x1505; // 5381, the usual djb2 seed
    for (int i = 0; i < str.Length; i++)
    {
        byte c = (byte)str[i];
        if ((byte)(c - 0x41) <= 0x19) // 'A'..'Z' -> lowercase
            c += 0x20;
        if (c == 0x5C) // '\' -> '/'
            c = 0x2F;
        ret = (uint)(((ret << 5) + ret) + c); // ret * 33 + c
    }
    return ret;
}
// This is the second checksum.
// It is used to make sure the first hash pointed to the right place.
// If not, Isaac will search all records for the two correct hashes.
static uint Hash2(string str)
{
    uint ret = 0x5BB2220E;
    for (int i = 0; i < str.Length; i++)
    {
        byte c = (byte)str[i];
        if ((byte)(c - 0x41) <= 0x19) // 'A'..'Z' -> lowercase
            c += 0x20;
        if (c == 0x5C) // '\' -> '/'
            c = 0x2F;
        ret = (c ^ ret) * 0x1000193; // xor, then multiply by the 32-bit FNV prime
    }
    return ret;
}
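For anyone following along outside C#, here is a minimal Python sketch of the two hashes plus the lookup described in the comments above. The constants are taken from the posted code. The first hash looks like djb2 (seed 5381, `h * 33 + c`) and the second like FNV-1a with the usual 32-bit prime but a non-standard starting value. The `Record` class, its field names, and `find_record` are hypothetical stand-ins, since the real record layout is not known in this thread. The 2048-slot table is only inferred from the 0x7FF mask.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Sequence


def normalized_bytes(path: str) -> Iterator[int]:
    """Yield each character as a byte, lower-cased, with '\\' mapped to '/'."""
    for ch in path:
        c = ord(ch) & 0xFF          # mirrors the (byte) cast in the C# code
        if 0x41 <= c <= 0x5A:       # 'A'..'Z'
            c += 0x20
        if c == 0x5C:               # backslash
            c = 0x2F                # forward slash
        yield c


def hash1(path: str) -> int:
    h = 0x1505
    for c in normalized_bytes(path):
        h = ((h << 5) + h + c) & 0xFFFFFFFF   # h * 33 + c, wrapped to 32 bits
    return h


def hash2(path: str) -> int:
    h = 0x5BB2220E
    for c in normalized_bytes(path):
        h = ((h ^ c) * 0x1000193) & 0xFFFFFFFF
    return h


@dataclass
class Record:
    """Hypothetical record: only the two hash fields matter for this sketch."""
    path_hash: int
    check_hash: int


def find_record(path: str, table: Sequence[Optional[Record]]) -> Optional[Record]:
    """Try the slot picked by hash1 & 0x7FF, then fall back to a full scan."""
    wanted = (hash1(path), hash2(path))
    slot = wanted[0] & 0x7FF
    candidate = table[slot] if slot < len(table) else None
    if candidate is not None and (candidate.path_hash, candidate.check_hash) == wanted:
        return candidate
    for record in table:
        if record is not None and (record.path_hash, record.check_hash) == wanted:
            return record
    return None
```

Because both hashes normalize case and slashes first, `GFX\A.png` and `gfx/a.png` land on the same record.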
Quick test and it works quite well. Of course many of the graphics didn't get renamed. Going to have to do something about getting names from the XML files. I think I may change this to dump by hex instead of an incrementing number, then create a second step that gathers strings from the XML files and the binary and tries to repair names.
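One way to read the two-step idea is sketched below in Python: name every unknown record after its hash in hex, then rename later once candidate paths turn up. The `.bin` extension, the function names, and passing the hash function in as `hash_fn` are all assumptions made for illustration, not what the tool in this thread does.

```python
import re
from typing import Callable, Dict, Iterable

# Matches names produced by hex_name(), e.g. "00001505.bin".
_HEX_NAME = re.compile(r"^([0-9a-f]{8})\.bin$")


def hex_name(path_hash: int) -> str:
    """Placeholder file name for a record whose real path is unknown."""
    return f"{path_hash & 0xFFFFFFFF:08x}.bin"


def repair_names(dumped: Iterable[str],
                 candidates: Iterable[str],
                 hash_fn: Callable[[str], int]) -> Dict[str, str]:
    """Map placeholder names to real paths whose hash matches."""
    by_hash = {hash_fn(path) & 0xFFFFFFFF: path for path in candidates}
    renames = {}
    for name in dumped:
        match = _HEX_NAME.match(name)
        if match is None:
            continue                      # already has a real name
        path_hash = int(match.group(1), 16)
        if path_hash in by_hash:
            renames[name] = by_hash[path_hash]
    return renames
```

Naming by hash instead of by a running counter keeps the information needed for the repair step in the file name itself.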
“dump by hex instead of an incrementing number”
what do you mean? why not simply load all files of config.a into memory, then, if a file is XML, look at its root node? use a manually created table to map the root node name to an attr name and xpaths to reconstruct resource paths.
e.g. <players /> has three root attributes: root, portraitroot, and bigportraitroot. reconstructing the paths would be driven by something like this:
{
  "players": {
    "root": [ "player/@name", "player/hair/@gfx" ],
    "portraitroot": [ "player/@portrait" ],
    "bigportraitroot": [ "player/@bigportrait" ]
  },
  ...
}
we’d consume that, and use it on the XML file it applies to (the one with a <players> root node) by reading the attribute players/@root, and concatenating that prefix with each of players/player/@name and players/player/hair/@gfx.
as soon as we have created all those file names, we read all .a files, not only config.a, and use the file name hashes to name them with the file names we constructed.
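The table-driven reconstruction described above could look roughly like this in Python. It is a sketch under assumptions: `attr_values` and `reconstruct_paths` are made-up names, the prefix and value are joined by plain concatenation as the comment suggests, and the `element/@attr` strings are split by hand because `xml.etree.ElementTree` only supports a subset of XPath and does not select attributes directly.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def attr_values(root: ET.Element, spec: str) -> List[str]:
    """Resolve 'a/b/@attr' (or '@attr' on the root itself) relative to root."""
    elem_path, _, attr = spec.rpartition("@")
    elem_path = elem_path.rstrip("/")
    elements = root.findall(elem_path) if elem_path else [root]
    return [e.get(attr) for e in elements if e.get(attr) is not None]


def reconstruct_paths(xml_text: str,
                      table: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Build candidate resource paths for one XML file using the mapping table."""
    root = ET.fromstring(xml_text)
    rules = table.get(root.tag, {})
    paths: List[str] = []
    for root_attr, specs in rules.items():
        prefix = root.get(root_attr, "")      # e.g. the value of players/@root
        for spec in specs:
            paths.extend(prefix + value for value in attr_values(root, spec))
    return paths


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    sample = ('<players root="gfx/characters/" portraitroot="gfx/ui/">'
              '<player name="a.png" portrait="p.png"><hair gfx="h.png"/></player>'
              '</players>')
    table = {"players": {"root": ["player/@name", "player/hair/@gfx"],
                         "portraitroot": ["player/@portrait"]}}
    for path in reconstruct_paths(sample, table):
        print(path)
```

Each reconstructed path can then be hashed and compared against the hashes stored in the archive records.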
so, here it is, but some things to note:
- <pocketitems> has pocketitems/card/hud named like 00_TheFool: those are possibly also file names…
- <preloads> has no kind of root so i used "" as the key. the file names are specified in full as preloads/preload/png/@path
- stages, <bosses> and <nightmares> each have an attribute that is itself a file path.
- <fxLayers> has two “root” nodes, the second of which has no resource root. i used "fxLayers.gfxroot" to signify it uses another node’s path root instead of its own
{
  "achievements": {
    "gfxroot": [ "achievement/@gfx" ]
  },
  "babies": {
    "root": [ "baby/@skin" ]
  },
  "backdrops": {
    "gfxroot": [ "backdrop/@gfx" ]
  },
  "bosses": {
    "root": [ "boss/@portrait", "@anm2" ]
  },
  "costumes": {
    "anm2root": [ "costume/@anm2path" ]
  },
  "cutscenes": {
    "root": [ "cutscene/anm2part/@anm2", "cutscene/videopart/@file" ]
  },
  "fxLayers": {
    "gfxroot": [ "fx/@path", "fx/gfx/@path" ]
  },
  "fxRays": {
    "fxLayers.gfxroot": [ "rayGroup/fxRay/@path" ]
  },
  "giantbook": {
    "anm2root": [ "entry/@anm2", "entry/@gfx" ]
  },
  "items": {
    "gfxroot": [ "passive/@gfx", "active/@gfx", "familiar/@gfx", "trinket/@gfx" ]
  },
  "music": {
    "root": [ "track/@intro", "track/@path", "track/@layerintro", "track/@layer" ]
  },
  "nightmares": {
    "root": [ "nightmare/@anm2", "@progressAnm2" ]
  },
  "players": {
    "root": [ "player/@name", "player/hair/@gfx" ],
    "portraitroot": [ "player/@portrait" ],
    "bigportraitroot": [ "player/@bigportrait" ]
  },
  "preloads": {
    "": [ "preload/png/@path" ]
  },
  "sounds": {
    "root": [ "sound/sample/@path" ]
  },
  "stages": {
    "root": [ "stage/@path" ],
    "bossgfxroot": [ "stage/@playerspot", "stage/@bosspot" ]
  },
  "entities": {
    "anm2root": [ "entity/@anm2path" ]
  }
}
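The two special keys in this table ("" for no root, and "fxLayers.gfxroot" for a root borrowed from another node) need slightly different handling from a plain attribute name. A small Python sketch of that lookup follows. `resolve_prefix` and the `roots_by_tag` dictionary are hypothetical, and how a file with two top-level nodes gets parsed into separate elements is left out here.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def resolve_prefix(key: str,
                   own_root: ET.Element,
                   roots_by_tag: Dict[str, ET.Element]) -> str:
    """Turn a table key into the path prefix to prepend to each value.

    ""              -> no prefix, the values are already full paths
    "tag.attr"      -> read attr from another root node with that tag
    anything else   -> read that attribute from the node's own root
    """
    if key == "":
        return ""
    if "." in key:
        tag, attr = key.split(".", 1)
        other = roots_by_tag.get(tag)
        return other.get(attr, "") if other is not None else ""
    return own_root.get(key, "")
```

With this in place, the `fxRays` entry would pick up the `gfxroot` value of the `fxLayers` node, and the `preloads` entry would use its paths unchanged.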
i have basically implemented it, but i either guessed wrong which unused int is the hash, used the wrong hash function, or did something else wrong.
will look into it tomorrow, it’s 3am here.
maybe one of you wants to have a look? flying-sheep/BoIRResourceDecryption@a7fbb691dffbf8619390aca9628bb9457ffba2ab
i didn’t yet address the special cases i mentioned in the last comment.
With that and referenced strings I got 567 filenames decoded out of 2,748 total. I tried using the second hash (the key integer after the first hash) but that didn't increase the number.
what do you mean by “referenced strings”?
new commit makes things prettier, faster, and finds more (i think) flying-sheep/BoIRResourceDecryption@75c548c59761dcfa98711b377345800e0f7afb83
when calling
i see many lines like resources/gfx/Effects/Effect_Xray_Cathedral.png. i think those are used in some call like loadResource(const char * path) which maps them to offsets in the resource files.
there must be some table in the resource file or binary that maps the file names to records in the archive which we could use to assign names to the records.
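Until such a table is found, the path-like strings in the binary can be harvested and hashed directly. The Python sketch below pulls printable ASCII runs out of a byte blob, keeps the ones that end in a resource-like extension, and builds a hash-to-name table. The extension list is a guess, `hash_fn` stands for whichever path hash turns out to be the right one, and whether the game hashes the path with or without the leading `resources/` is not known here.

```python
import re
from typing import Callable, Dict, Iterable, List, Tuple


def printable_strings(blob: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


def candidate_paths(strings: Iterable[str],
                    extensions: Tuple[str, ...] = (".png", ".anm2", ".xml",
                                                   ".ogg", ".wav")) -> List[str]:
    """Keep strings that look like resource paths (extension list is a guess)."""
    return [s for s in strings if s.lower().endswith(extensions)]


def name_table(paths: Iterable[str],
               hash_fn: Callable[[str], int]) -> Dict[int, str]:
    """Map each path's hash to the path, for matching against archive records."""
    return {hash_fn(path): path for path in paths}


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:      # path to the game binary
        found = candidate_paths(printable_strings(handle.read()))
    print(len(found), "path-like strings")
```

A string run can carry extra printable bytes before the actual path, so some candidates may need trimming before they hash to anything useful.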