HearthSim / UnityPack

Python deserialization library for Unity3D Asset format
https://hearthsim.info/
MIT License

Improve structs.dat handling #8

Open jleclanche opened 8 years ago

jleclanche commented 8 years ago

We should have a system that lets us load different structs.dat files for different Unity versions. Newer ones aren't fully compatible with older ones.

capntrips commented 7 years ago

I'm not sure if this should go here, in #6, get its own issue, or some combination, so hopefully this is the proper forum.

As a new user to this (great!) software, it took me hours to figure out that if an AssetBundle doesn't provide TypeTrees, it will silently fall back on structs.dat for a potentially different format version. It might be worth adding a warning in that case, so others can avoid that hassle in the future.

As for acquiring an updated structs.dat, could you potentially elaborate on how you got it from disunity? I subscribed to that issue if you think that is a better place to discuss it.

Also, a related question. If an AssetBundle doesn't provide TypeTrees, are they stored somewhere in the target app's metadata?

In the interim, for those that run into something similar, I managed to dump the TypeTrees for specific types by performing the following steps:

  1. create an asset of the target type in the target version of Unity
  2. add it to an AssetBundle
  3. load the AssetBundle in UnityPack
  4. save the bytes for that specific object's TypeTree from here to a file

Note: This may work if you just export an individual asset, rather than a bundle, but I haven't investigated.

I was then able to use the bytes to create the trees for an AssetBundle without TypeTrees with the following code:

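from io import BytesIO
from unitypack.type import TypeTree  # import paths as laid out in the UnityPack source tree
from unitypack.utils import BinaryReader

# typeBytes: the raw TypeTree bytes saved in step 4; obj: the object to decode
Texture2DTypeTree = TypeTree(asset.format)
Texture2DTypeTree.load(BinaryReader(BytesIO(typeBytes)))
asset._buf.seek(asset._buf_ofs + obj.data_offset)
buf = BinaryReader(BytesIO(asset._buf.read(obj.size)))
data = obj.read_value(Texture2DTypeTree, buf)

jleclanche commented 7 years ago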

I honestly don't remember how I got hold of structs.dat at this point. I'm very much in favour of a better system for handling the pile of mess that is builtin typetrees, especially since they change version to version. I suspect it's the root cause of a bunch of issues.

Maybe @ifeherva or @Mischanix can chime in.

ifeherva commented 7 years ago

"it took me hours to figure out that if an AssetBundle doesn't provide TypeTrees, it will silently fall back on structs.dat for a potentially different format version. "

I think this behavior is mentioned in the wiki. If not, it should be :)

andburn commented 7 years ago

I assume it's generated with this Gist from @Mischanix

capntrips commented 7 years ago

Thanks for the link. I'll see if I can make it work for my target version.

On the silently falling back on a different format version thing, I was just thinking it might be nice to have it issue a warning by default if the asset version doesn't match the structs version. I'd be happy to submit a pull request if it's something you think you'd accept.
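
A minimal sketch of what such a warning could look like, assuming the caller already knows the asset's Unity version, the version the bundled structs.dat was dumped from, and whether the asset carries its own TypeTrees. The function name and all three parameters are hypothetical and are not part of UnityPack's API:

```python
import warnings


def check_structs_fallback(asset_version, structs_version, has_type_tree):
    """Warn when an asset without TypeTrees relies on mismatched built-in structs.

    Hypothetical helper: the version strings are whatever the caller read
    from the asset and from the bundled structs.dat.
    Returns True if a warning was issued.
    """
    if has_type_tree:
        return False  # the asset carries its own TypeTrees, so no fallback happens
    if asset_version == structs_version:
        return False  # fallback happens, but the versions agree
    warnings.warn(
        f"Asset built with Unity {asset_version} has no TypeTrees; "
        f"falling back on structs.dat from {structs_version}. "
        "Deserialized values may be wrong.",
        RuntimeWarning,
        stacklevel=2,
    )
    return True
```

Using the standard `warnings` module means callers who know what they are doing can silence the message with the usual filters.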

Thanks again, and thanks for the crazy fast replies.

capntrips commented 7 years ago

I managed to get the gist updated to:

The basic steps I used, in case someone wants to replicate this with a different version:

  1. Create an empty project in Unity and build it for Windows x86_64 with 'Copy PDB files' checked (but not Development Build, though that may not matter).
  2. Import the EXE into IDA, using the PDB for debug symbols.
  3. Update the offsets in the gist.
  4. Build Inject.exe and UnityStructGen.dll with the instructions from UnityStructGen.cpp.
  5. As administrator, with the empty project EXE running, inject the DLL (be sure to use the full path to the DLL).
  6. Use the resulting structs.dat, strings.dat, and classes.json (will be in the empty project's build folder).

Note: While the type trees are coming from later versions, they are still being written in format 15, so they'll only work if they are loaded as such.

Edit: Updated to 2017.1.2p4. Edit: Updated to 2017.1.3p4.

yanchl commented 6 years ago

@capntrips Seeing that you extracted the structs.dat file, I'd really like to know how you got these offset values.

capntrips commented 6 years ago

@yanchl Once you have the EXE and PDB loaded in IDA, you can find them all in the Names and Functions windows. For example, from 2017.1.0p5:

| Variable Name | IDA Name | Offset |
| --- | --- | --- |
| globalBuf | aAabb_0 | 0x141137870 |
| GenerateTypeTree | GenerateTypeTree(Object const &,TypeTree &,TransferInstructionFlags) | 0x1406F99F0 |
| Object__Produce | Object::Produce(Unity::Type const *,int,MemLabelIdentifier,ObjectCreationMode) | 0x1401F9B60 |
| TypeTree__TypeTree | TypeTree::TypeTree(MemLabelIdentifier) | 0x1406F6D60 |
| unityVersion | a2017_1_0p5 | 0x141189490 |
| rttiData | RTTI::RuntimeTypeArray RTTI::ms_runtimeTypes | 0x1414C0AA0 |

mafaca commented 6 years ago

In Unity 2017.3 the format changed. For those of you who need it, I uploaded a modified version here (ver. 2017.3.0f3).

kasubafe commented 6 years ago

For Unity 2017.4.0f1, the relevant offsets are as follows:

| Variable Name | Offset |
| --- | --- |
| globalBuf | 0x181117970 |
| GenerateTypeTree | 0x18061dda0 |
| Object__Produce | 0x1802eb8c0 |
| TypeTree__TypeTree | 0x18061cf80 |
| unityVersion | 0x18110cc48 |
| rttiData | 0x181443d90 |
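
When comparing values like these across builds, it can help to express them relative to the module's image base. The sketch below assumes the binaries were linked with the MSVC linker's default preferred bases for 64-bit images (0x140000000 for an EXE, 0x180000000 for a DLL). The thread does not state the image bases or whether the gist expects absolute addresses, so treat both constants and both helper names as assumptions:

```python
# Default preferred image bases used by the MSVC linker for 64-bit images.
# Assumption: the builds discussed in this thread were linked with these defaults.
EXE_IMAGE_BASE = 0x140000000
DLL_IMAGE_BASE = 0x180000000


def to_rva(address, image_base):
    """Convert an absolute address as shown by IDA into an offset from the image base."""
    if address < image_base:
        raise ValueError("address lies below the image base")
    return address - image_base


def rebase(address, image_base, runtime_base):
    """Translate an IDA address to where it lands when the module loads at runtime_base."""
    return runtime_base + to_rva(address, image_base)


# GenerateTypeTree from the 2017.1.0p5 table, relative to an EXE at the default base.
print(hex(to_rva(0x1406F99F0, EXE_IMAGE_BASE)))
```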

The updated version (code + dumped .dat files), based on @mafaca's code, can be found at this gist

nanoNago commented 4 years ago

What's the right way to tell which version we need?

Let's say we start keeping multiple versions of structs.dat -- which value should we use to decide which we need?

I guess if we're looking at a .assets file and the TypeMetadata region declares generator_version "2018.4.11f1" we ought to load that version's built-in type definitions, right? (It should be based on the version you find at the root of the asset file?)
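
One way this selection could work, sketched under assumptions: version strings look like "2018.4.11f1", the release letters order as alpha < beta < final < patch, and when no exact match exists the newest dump that is not newer than the asset is the best guess. The fallback heuristic and both function names are hypothetical; nothing in this thread confirms that a neighbouring version's structs are compatible.

```python
import re

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)([abfp])(\d+)")
# Assumed ordering of release-type letters: alpha < beta < final < patch.
_RELEASE_ORDER = {"a": 0, "b": 1, "f": 2, "p": 3}


def parse_unity_version(text):
    """Turn a string such as '2018.4.11f1' into a sortable tuple of ints."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized Unity version: {text!r}")
    major, minor, patch, kind, build = match.groups()
    return (int(major), int(minor), int(patch), _RELEASE_ORDER[kind], int(build))


def pick_structs(generator_version, available):
    """Choose a structs.dat for an asset.

    `available` maps Unity version strings to file paths. An exact match wins;
    otherwise the newest dump not newer than the asset is used, and if every
    dump is newer, the oldest one. Returns (version, path).
    """
    if not available:
        raise ValueError("no structs.dat dumps available")
    wanted = parse_unity_version(generator_version)
    ranked = sorted(available, key=parse_unity_version)
    not_newer = [v for v in ranked if parse_unity_version(v) <= wanted]
    chosen = not_newer[-1] if not_newer else ranked[0]
    return chosen, available[chosen]
```

A caller would likely still want to warn whenever the chosen version differs from the requested one, since the result is then only a guess.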

I'm not sure if there are cases where you might want to load multiple asset files that have different generator versions; if the implied struct definitions can change I'm assuming this is inherently "illegal". It looks like @mafaca has a project that's gotten quite far and succeeds in some places where UnityPack fails, so maybe they can help drop some knowledge.

I'm willing to put in a little work to get UnityPack back up to par; it's quite handy for scripting the extraction of assets, especially on non-Windows platforms where relying on C# projects can be a little harder.

DaZombieKiller commented 4 years ago

Over the past two days I've been working on a tool that allows you to generate structs.dat, structs.dump, classes.json and strings.dat from within the Unity editor itself.

It makes use of the DIA SDK to load unity_x64.pdb and locate symbols at initialization time, instead of requiring you to hardcode everything. This means it's far more resistant to breakage between Unity versions.

https://github.com/DaZombieKiller/TypeTreeTools

Note that it's only been properly tested with Unity 2020.1.0a20 for now, but I hope someone finds it useful.

chenruikun commented 3 years ago

@capntrips, could you please share the code for saving the bytes of a specific object's TypeTree to a file?

capntrips commented 3 years ago

@chenruikun I was unable to find the code I used in that example, but I imagine it was something like this (untested):

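# Patched load_blob (untested); needs `from io import BytesIO` at module level.
def load_blob(self, buf):
    before = buf.tell()  # start of the serialized TypeTree
    num_nodes = buf.read_uint()
    self.buffer_bytes = buf.read_uint()
    node_data = BytesIO(buf.read(24 * num_nodes))  # 24 bytes per node
    self.data = buf.read(self.buffer_bytes)  # string buffer
    after = buf.tell()  # end of the serialized TypeTree
    # Rewind and dump the raw bytes just consumed; the read leaves buf at `after` again.
    buf.seek(before)
    # Note: 'wb' overwrites the file on every call.
    with open('typetrees.dat', 'wb') as fp:
        fp.write(buf.read(after - before))
    # ... the rest of the original method (parsing node_data) would presumably continue here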

I later started saving them for the individual objects, rather than the entire AssetBundle, probably using a similar method in TypeMetadata.load, but again, I was unable to find the code I used to generate the files.
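
For reading such a saved file back outside UnityPack, here is a rough sketch that slices one TypeTree blob out of raw bytes. It assumes the layout implied by the snippet above (a uint32 node count, a uint32 string-buffer size, 24 bytes per node, then the string buffer) and little-endian byte order, which is an assumption and may not hold for every file. The function name is hypothetical.

```python
import struct

NODE_SIZE = 24  # bytes per node, taken from the snippet above


def split_type_tree_blob(raw, offset=0, byte_order="<"):
    """Slice one blob-format TypeTree out of `raw`.

    Assumed layout: uint32 node count, uint32 string-buffer size,
    the node table, then the string buffer.
    Returns (num_nodes, node_bytes, string_bytes, end_offset).
    """
    num_nodes, buffer_bytes = struct.unpack_from(byte_order + "II", raw, offset)
    nodes_start = offset + 8
    strings_start = nodes_start + NODE_SIZE * num_nodes
    end = strings_start + buffer_bytes
    if end > len(raw):
        raise ValueError("blob is truncated")
    return num_nodes, raw[nodes_start:strings_start], raw[strings_start:end], end
```

Returning `end_offset` lets a caller walk several concatenated trees by passing it back in as `offset`.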

ds5678 commented 2 years ago

To anyone who might find this, I have an archive of the struct information for nearly every Unity version.

https://github.com/ds5678/TypeTreeDumps

Disclaimer: I did not check whether or not the binary struct files on my repository are in the exact same format as the structs.dat files used by UnityPack.