Open FullLifeGames opened 1 year ago
Yep, there are lots of schemas from previous games. Saw your pull request - how are you planning to detect which schema to use for which file, when the files inside the trpaks don't have filenames?
So for me, the extraction of the trpfs and trpak files also resulted in filenames. I guess your script works better than expected :D
How many files did that extract? There should be a total of 273873 files as of version 1.0.1.
TRPAK files do have file names, but they are stored within the code. See these three, for example:

```
fnv1a64('avalon/data/personal_array.bin')
fnv1a64('avalon/data/tokusei_array.bin')
fnv1a64('avalon/data/waza_array.bin')
```

The resulting hashes should be within the TRPFD hash table, and also within the TRPAK hash table.
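For reference, `fnv1a64` here is the standard 64-bit FNV-1a hash. A minimal Python version (my own sketch, not the repo's code) that reproduces those path hashes:

```python
FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x00000100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF

def fnv1a64(data: bytes) -> int:
    """Standard 64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV64_OFFSET
    for b in data:
        h = ((h ^ b) * FNV64_PRIME) & MASK64
    return h

# e.g. the personal-table path mentioned above:
print(hex(fnv1a64(b'avalon/data/personal_array.bin')))
```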
My bad then, I only get around 17000 files. I got a bit overexcited, since the language tables are present in e.g. `arc\messagedat\English\common\` and can be extracted. Examples are files like `monsname.tbl`.
`monsname.tbl` is an AHBT file that should only contain fnv1a64 hashes and names. `monsname.dat` is what you want, and from the sounds of it, that wasn't extracted? Or are the TBL and DAT files appended together...?
That makes a lot of sense. When I compared it with the SWSH dump, I was wondering why there was no ".dat" file. Since some of them are extractable (I modified pkNX a bit), I assume they are appended together or something?
Anyway, it seems like this repo needs a lot more work on the extraction side.
Essentially, how the format works is: the code hashes a file path, and that hash is looked up in the trpfd hash table. The trpfd map table is then read to obtain the index into the trpfd key table and file table. The code then hashes the key from the key table, looks that hash up in the trpfs hash table to get the associated offset vector value, and uses the size from the file table for reading. Lots of overcomplicated bullshit, when they could just get rid of the fnv1a64 hashes...
Ugh, and I even forgot the final steps: searching the trpak hash table for the file path hash, and then decompressing the file data using Oodle...
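The chain above can be sketched roughly like this. All structure and field names here are illustrative placeholders for the parsed tables, not the real schema:

```python
from typing import NamedTuple

MASK64 = 0xFFFFFFFFFFFFFFFF

def fnv1a64(data: bytes) -> int:
    """Standard 64-bit FNV-1a hash."""
    h = 0xCBF29CE484222325
    for b in data:
        h = ((h ^ b) * 0x00000100000001B3) & MASK64
    return h

class Trpfd(NamedTuple):
    hash_to_index: dict   # fnv1a64(file path) -> map-table index (hypothetical view)
    keys: list            # key table
    sizes: list           # file table (sizes)

class Trpfs(NamedTuple):
    hash_to_offset: dict  # fnv1a64(key) -> offset vector value (hypothetical view)

def locate(path: str, trpfd: Trpfd, trpfs: Trpfs) -> tuple:
    """Resolve a file path to (offset, size), following the chain described above."""
    idx = trpfd.hash_to_index[fnv1a64(path.encode())]     # trpfd hash table + map table
    key = trpfd.keys[idx]                                 # trpfd key table
    size = trpfd.sizes[idx]                               # trpfd file table
    offset = trpfs.hash_to_offset[fnv1a64(key.encode())]  # trpfs hash table -> offset
    return offset, size
    # ...after which the real flow also searches the trpak hash table for the
    # path hash and Oodle-decompresses the file data.
```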
Since I did not write the extraction for the trpfd and trpfs, I only kind of get where you are coming from. I can find the file hashes and the keys, but what I am stuck on is where the filenames for the 273873 files should be located. For the 16112 trpak files this is doable, but the rest seem... not really available.
Those file names are stored within the code, not anywhere in the data filesystem. If you uncompress the `main` file found in the exefs, you should be able to find the strings I mentioned: `avalon/data/`, `personal_array.bin`, `tokusei_array.bin` and `waza_array.bin`.
Here are the flatbuffer schemas I created while looking into this: https://anonfiles.com/y347n7H8yf/trinity_7z
Since the trpfd/trpak hashes are only in the code, I'd highly recommend just extracting the data like so: `<sanitized trpfs key>/<trpak hash>`. At least until someone runs over all the strings in the code and creates a list of correct file paths.
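One trivial way to build such output paths - the sanitization scheme here is my own choice, nothing about it is mandated by the tools:

```python
import re

def output_path(trpfs_key: str, trpak_hash: int) -> str:
    """Build an extraction path of the form <sanitized key>/<hash as hex>."""
    # Replace anything path-hostile in the key with '_', then append the
    # trpak hash as 16 hex digits.
    safe_key = re.sub(r'[^A-Za-z0-9._-]', '_', trpfs_key)
    return f"{safe_key}/{trpak_hash:016x}"

# output_path('arc/foo.trpak', 0xDEADBEEF) -> 'arc_foo.trpak/00000000deadbeef'
```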
Sorry that I have to keep asking, but this is all rather new to me. What I know is that you can load the exefs `main` file into Ghidra and modify it there; I did not know that there are ways to decompress its contents.
I found something like this, https://github.com/0CBH0/nsnsotool, but it doesn't seem to go the whole way.
The flatbuffer schemas match the ones we use here as well, so this is good.
For NSO uncompression, I would recommend hactool.
Oh, there we go. (For others: open the file in the HxD hex editor and search for e.g. "personal_array.bin", and you will find it.)
Gotcha, now I also get the challenge of trying to extract them.
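If you'd rather script the search than eyeball it in HxD, a rough strings-style scan over the decompressed binary could look like this (entirely my own sketch; the heuristic just keeps printable runs that look like file paths):

```python
import re

def path_strings(blob: bytes, min_len: int = 8):
    """Yield printable runs that look like file paths (contain '/' and an extension)."""
    for m in re.finditer(rb'[\x20-\x7e]{%d,}' % min_len, blob):
        s = m.group().decode('ascii')
        if '/' in s and '.' in s.rsplit('/', 1)[-1]:
            yield s

# e.g.: path_strings(open('main.decompressed', 'rb').read())
```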
With https://github.com/psthrn42/SCVI_Extract/pull/5 the data will now be stored in a similar format to the one you mentioned above. Thanks for all the input!
@bWFpbA @FullLifeGames This was actually already pretty much how trpak_extract worked. Guess I didn't look hard enough at your full_extract PR before merging lol. I'll have a look at your new PR.
On a different note, @bWFpbA did you know that this game includes a bunch of BFBS files? They basically contain all the information about the original .fbs the devs used, including object and field names, etc., and can be used to reconstruct it almost exactly. I think we got really lucky with that one. Unfortunately I couldn't find any for models or anims or anything (though I'm pretty sure the anims are very similar to previous games' - they sort of open in Switch Toolbox), but there is a massive one for the Pokemon AI, for example. Grep for BFBS and you should see them all.
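If grep on binaries is inconvenient, the same scan can be done from Python (the directory name is just a placeholder; point it at whatever your extracted dump root is):

```python
import pathlib

def find_bfbs(root: str):
    """Return every file under `root` whose bytes contain the BFBS identifier."""
    return [p for p in pathlib.Path(root).rglob('*')
            if p.is_file() and b'BFBS' in p.read_bytes()]

# e.g.: find_bfbs('romfs_extracted')
```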
And thanks for the better trpfd schema.
Also, one more question: I haven't had a look at the exefs yet, but are all the paths really hardcoded in? In previous Pokemon games we pretty much had to guess most of the hashes (I think).
I did see all the BFBS files, though I don't think any were of interest to me - I was mainly digging through the files to get to the personal table, and I don't think that had a BFBS file.
I'm not sure about the previous games, but I believe most of them are in the code? I probably wouldn't have found the personal table otherwise, since they chose to store it in the `avalon/` directory rather than the usual `pml/` directory.
Do BFBS files have a documented structure? I've never looked at them before. I see that they are also a flatbuffer (pain). Currently trying to figure out the trainer BFBS.
EDIT: Heh, I don't think I've ever had much fun reverse engineering flatbuffers, but the BFBS one is quite something - a table within a table within a struct within a table within either a vector of tables or just a table is quite a sight in a hex editor.
EDIT 2: Ugh, and now I learn that the main flatbuffers repository has a `reflection.fbs` file that is the schema for the BFBS format. Oh well, it was fun while it lasted.
Is there a way to extract the Pokemon models?
@psthrn42 apparently there are schemas already created for Legends Arceus which might help us here:
https://github.com/rh-hideout/rhh-docs/tree/main/NX/flatbuffers/LA
Will be going to bed, maybe I'll have a look in the morning.