ata4 / disunity

An experimental toolset for Unity asset and asset bundle files.
The Unlicense

Prototype UnityFS bundle reader support #200

Closed. RupW closed this 7 years ago

RupW commented 7 years ago

This is a first step towards issue #172. I don't think it is really good enough to include as-is, but it appears to work, and there are some design decisions you should probably make about how to capture and structure the data and how to support LZ4.

The headlines for UnityFS are:

For now I have shoe-horned the data into the existing structures, except I've added a new BundleInfoEntryFS since they are serialised differently. Might be worth considering whether to stick to a single set of structures or separate out new UnityFS classes?

To decompress LZ4 I have imported a subset of lz4-java: the native Java known-decompressed-size version. I did that rather than add a dependency because the original is likely overkill (a 200K jar, including optimised JNI versions for multiple platforms) and we're only decompressing the data header with this. But adding the dependency may be the correct solution long-term.
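
For readers unfamiliar with what a "known-decompressed-size" LZ4 decoder does, here is a minimal pure-Python sketch of decoding one raw LZ4 block. It is not the lz4-java code imported in this PR; the function name `lz4_block_decompress` is made up for illustration, and the sketch favours clarity over speed.

```python
def lz4_block_decompress(src: bytes, uncompressed_size: int) -> bytes:
    """Decode one raw LZ4 block whose decompressed size is known up front."""
    out = bytearray()
    i = 0
    n = len(src)
    while i < n:
        token = src[i]
        i += 1

        # High nibble: literal length; 15 means more length bytes follow.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                extra = src[i]
                i += 1
                lit_len += extra
                if extra != 255:
                    break
        if i + lit_len > n:
            raise ValueError("literal run goes past the end of the input")
        out += src[i:i + lit_len]
        i += lit_len

        # The final sequence carries literals only, so the input may end here.
        if i >= n:
            break

        # Two-byte little-endian offset back into the output written so far.
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")

        # Low nibble: match length minus 4, extended the same way as literals.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = src[i]
                i += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4

        # Copy byte by byte, because the match may overlap bytes being written.
        start = len(out) - offset
        for k in range(match_len):
            out.append(out[start + k])

    if len(out) != uncompressed_size:
        raise ValueError("decoded %d bytes, expected %d" % (len(out), uncompressed_size))
    return bytes(out)
```

The caller has to supply the expected output size, which is why this variant suits a format that stores the uncompressed size next to the compressed data.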

ciuppa commented 7 years ago

Is this something that I can try out? I'm not sure how to build the project but I have a rather large sample set that I can test this against.

RupW commented 7 years ago

I'm afraid it only unpacks as far as disunity's current bundle unpack: usually into a single asset file, or an asset file and an attached resource file, no further. If that's any use to you, or if you're just offering to test the unpacking against a large number of files, then I can put a build somewhere for you yes - probably upload a compiled .jar to my fork-project here on GitHub. But if you were hoping to extract meshes etc. then I'm afraid disunity 0.5 doesn't go that far yet sorry.

To build, you'll need a Java 1.8 JDK and Apache Maven. Then get a git checkout and run mvn package from that directory; it will build a new disunity.jar into disunity-dist\target. Or I can put a compiled jar somewhere for you if you want.

ciuppa commented 7 years ago

I'm offering to test against a large set of files.

Cheers, Brian


jorgerobles commented 7 years ago

Question: So, after compiling, the command should be "java -jar -Xmx8192m disunity.jar bundle unpack {the name of the UnityFS file}"?

Does it work on StreamingAssets? Some apps come with assetbundle files "decompressed" into a folder tree structure, and the assets inside the folders are UnityFS files too.

Thanks

RupW commented 7 years ago

Question: So, after compiling, the command should be "java -jar -Xmx8192m disunity.jar bundle unpack {the name of the UnityFS file}"?

Yes, same as normal for disunity. (I'm not sure you need to up the Xmx - it ought to stream data through rather than load it all into memory - but I don't know the LZMA library well.)
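
To illustrate what "stream data through" means here, the following is a minimal sketch using Python's standard lzma module. It is not the Java LZMA library disunity uses, and the helper name `stream_decompress` is hypothetical. The compressed input is read in fixed-size chunks, so the whole compressed file never has to be held in memory at once.

```python
import io
import lzma


def stream_decompress(src, dst, chunk_size=64 * 1024):
    """Copy an LZMA/XZ stream from src to dst, reading fixed-size chunks."""
    decomp = lzma.LZMADecompressor()
    total = 0
    while not decomp.eof:
        chunk = src.read(chunk_size)
        if not chunk:
            raise EOFError("compressed stream ended early")
        data = decomp.decompress(chunk)
        dst.write(data)
        total += len(data)
    return total
```

Whether a larger heap is needed for the real tool depends on how the Java library buffers its output, which this sketch says nothing about.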

Does it work on StreamingAssets? Some apps come with assetbundle files "decompressed" into a folder tree structure, and the assets inside the folders are UnityFS files too.

I don't know: I've only added support for understanding the new header. If disunity handled this for UnityWeb then it might just work for UnityFS too. If not, if you can point me at a sample file then I can have a look, but I can't promise how quickly (sorry!)

jorgerobles commented 7 years ago

https://mega.nz/#F!zwYWUSBC!jpXt0w3_p7W-pE84CAjKBQ

ata4 commented 7 years ago

I'll merge this, but I should mention at this point that I'm working on a Python re-implementation of Disunity that works much better with Unity's dynamic nature, so I don't think I'll continue to work on this repository. It currently requires about a seventh of the current Java code base (920 vs 6600 LOC, not counting dependencies) and already has working deserialization support. However, I'm currently working on UnityFS support, so this will definitely help me to get further.

ciuppa commented 7 years ago

What is the name of the Disunity Python implementation?

ata4 commented 7 years ago

The module is currently called pynity, but I still have to find a project name. I guess DisunityPy or something.

RupW commented 7 years ago

I think he meant "where can I get it".

Thanks for the merge! I'll still have a look at Jorge's streaming UnityFS when I get a chance, and put together a build that'll report better for any bits it doesn't understand - but I'm busy this weekend :-(

ata4 commented 7 years ago

I also found this Japanese document during my research, in case it can fill some gaps about the UnityFS format.

And yeah, the Python code isn't public yet, since it's in a very early stage right now and may undergo major changes. But if there's some interest in collaboration, I can change that any time, of course.

RupW commented 7 years ago

D'oh, I hadn't seen that. That's more or less the same as I know:

ata4 commented 7 years ago

Regarding structs, here's what I have from the .pdb files:

struct ArchiveStorageHeader::Header
{
   class std::basic_string<char,std::char_traits<char>,stl_allocator<char,54,16> > signature;
   int version;
   unsigned long Padding12;
   class std::basic_string<char,std::char_traits<char>,stl_allocator<char,54,16> > unityWebBundleVersion;
   class std::basic_string<char,std::char_traits<char>,stl_allocator<char,54,16> > unityWebMinimumRevision;
   __int64 size;
   int compressedBlocksInfoSize;
   int uncompressedBlocksInfoSize;
   int flags;
};

struct ArchiveStorageHeader::BlocksInfo
{
   unsigned char uncompressedDataHash[16];
   class std::vector<ArchiveStorageHeader::StorageBlock,stl_allocator<ArchiveStorageHeader::StorageBlock,54,16> > storageBlocks;
};

struct ArchiveStorageHeader::StorageBlock
{
   int uncompressedSize;
   int compressedSize;
   short flags;
};

struct ArchiveStorageHeader::DirectoryInfo
{
   class std::vector<ArchiveStorageHeader::Node,stl_allocator<ArchiveStorageHeader::Node,54,16> > nodes;
};

struct ArchiveStorageHeader::Node
{
   __int64 offset;
   __int64 size;
   int flags;
   unsigned long Padding5;
   class std::basic_string<char,std::char_traits<char>,stl_allocator<char,54,16> > path;
};
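
As a rough illustration of how the structs above could map onto a reader, here is a minimal Python sketch. Several things in it are assumptions that the struct dump does not state: that integers are stored big-endian, that strings are NUL-terminated, that each std::vector is serialised with a 32-bit element count in front, that the Padding members exist only in memory, and that DirectoryInfo follows BlocksInfo in the same decompressed buffer. All function and class names are made up for the example.

```python
import io
import struct
from typing import NamedTuple


class Header(NamedTuple):
    signature: str
    version: int
    unity_web_bundle_version: str
    unity_web_minimum_revision: str
    size: int
    compressed_blocks_info_size: int
    uncompressed_blocks_info_size: int
    flags: int


class StorageBlock(NamedTuple):
    uncompressed_size: int
    compressed_size: int
    flags: int


class Node(NamedTuple):
    offset: int
    size: int
    flags: int
    path: str


def read_cstring(f):
    """Read a NUL-terminated string (assumed on-disk string encoding)."""
    buf = bytearray()
    while True:
        b = f.read(1)
        if not b:
            raise EOFError("unterminated string")
        if b == b"\0":
            return buf.decode("utf-8")
        buf += b


def read_fields(f, fmt):
    """Unpack one fixed-size group of fields; '>' means big-endian (assumed)."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise EOFError("truncated input")
    return struct.unpack(fmt, data)


def read_header(f):
    """Read the fields of ArchiveStorageHeader::Header in declaration order."""
    signature = read_cstring(f)
    (version,) = read_fields(f, ">i")
    # Padding12 is treated as in-memory padding and not read (assumption).
    bundle_version = read_cstring(f)
    minimum_revision = read_cstring(f)
    # __int64 size, then the two blocks info sizes and the flags.
    size, c_size, u_size, flags = read_fields(f, ">qiii")
    return Header(signature, version, bundle_version, minimum_revision,
                  size, c_size, u_size, flags)


def read_blocks_info(data):
    """Parse already-decompressed blocks info followed by the node directory."""
    f = io.BytesIO(data)
    data_hash = f.read(16)  # unsigned char uncompressedDataHash[16]
    (block_count,) = read_fields(f, ">i")  # assumed vector length prefix
    blocks = [StorageBlock(*read_fields(f, ">iih")) for _ in range(block_count)]
    (node_count,) = read_fields(f, ">i")  # assumed vector length prefix
    nodes = []
    for _ in range(node_count):
        offset, size, flags = read_fields(f, ">qqi")
        nodes.append(Node(offset, size, flags, read_cstring(f)))
    return data_hash, blocks, nodes
```

The input to read_blocks_info is expected to be the blocks info after decompression, with its length matching uncompressedBlocksInfoSize from the header.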

ata4 commented 7 years ago

Also, I'm currently stuck at the LZ4 decompression. Both the C and Python versions always report invalid data at byte 4 when I try to decompress the blocks info, but your Java version seems to work. Are there customizations to the algorithm, or am I doing something wrong?

Edit: never mind, it's just Python and I also found the reason. Looks like it will be a bit trickier than I initially thought.

GreenReaper commented 7 years ago

If you want any more sample files, the Disney Zootopia Crime Files Hidden Object app seems to use these. Free download on Windows 10, files are in C:\Program Files\WindowsApps\Disney.ZootopiaCrimeFilesHiddenObject_[version string]\Data [StreamingAssets] [open an admin command prompt and launch Explorer from the WindowsApps folder to access]

C:\Users\[username]\AppData\Local\Packages\Disney.ZootopiaCrimeFilesHiddenObject_[random ID]\LocalState\bundle\case[X]

ei-hn commented 7 years ago

@ata4 I'd like to help with the Python port, where do I sign up?

ata4 commented 7 years ago

@ei-hn I'm not yet really satisfied with the current code base and want to apply some changes before uploading it to GitHub. Could take a few more weeks.

jleclanche commented 7 years ago

I'll merge this, but I should mention at this point that I'm working on a Python re-implementation of Disunity that works much better with Unity's dynamic nature, so I don't think I'll continue to work on this repository.

Hey @ata4, in case you're not aware, a few people and I at HearthSim have been working on exactly that. It's called UnityPack and it can currently read a ton of Unity files, including UnityFS/LZ4-compressed files.

It doesn't currently support reserialization. I'd love your help with the library, there's a lot to do still but it's served us well enough here for Hearthstone. Let me know if you're interested.

https://github.com/hearthsim/unitypack