HearthSim / UnityPack

Python deserialization library for Unity3D Asset format
https://hearthsim.info/
MIT License

Compressed UnityWeb AssetBundles ignore asset header #80

Open jnsnow opened 5 years ago

jnsnow commented 5 years ago

Hi, as seen in asset.py:

"# FIXME: this offset needs to be explored more"

I have some information on this; the header appears to look something like:

struct asset_header {
    uint32_t num_assets;
    string   name;       /* variable length */
    uint32_t offset;
    uint32_t size;
    uint8_t  padding[3]; /* ??? */
};

I'm not very familiar with UnityWeb files beyond the ones I have on hand, but these all have num_assets = 1 and a name of the form "CAB-{SHA1}". I assume that the name/offset/size fields are repeated num_assets times, and that the header is then padded out to the next 4-byte boundary.
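A minimal sketch of how that layout could be parsed, to make the guess concrete. Several things here are assumptions on my part and not confirmed anywhere: that the integers are big-endian, that the name is a NUL-terminated string, and that the padding aligns the stream position to a multiple of 4. `AssetEntry`, `read_cstring`, and `read_asset_header` are hypothetical names, not existing UnityPack functions.

```python
import io
import struct
from typing import BinaryIO, List, NamedTuple


class AssetEntry(NamedTuple):
    """One name/offset/size record from the guessed header (hypothetical type)."""
    name: str
    offset: int
    size: int


def read_cstring(buf: BinaryIO) -> str:
    """Read bytes up to a NUL terminator (assumed string encoding)."""
    chars = bytearray()
    while True:
        c = buf.read(1)
        if not c or c == b"\0":
            break
        chars += c
    return chars.decode("utf-8")


def read_asset_header(buf: BinaryIO) -> List[AssetEntry]:
    """Parse num_assets, then that many name/offset/size records, then align to 4."""
    # Assumption: big-endian unsigned 32-bit integers.
    (num_assets,) = struct.unpack(">I", buf.read(4))
    entries = []
    for _ in range(num_assets):
        name = read_cstring(buf)
        offset, size = struct.unpack(">II", buf.read(8))
        entries.append(AssetEntry(name, offset, size))
    # Assumption: the header is padded so the next read starts on a 4-byte boundary.
    buf.seek((buf.tell() + 3) & ~3)
    return entries


# Synthetic header with one entry and a made-up 40-character hash.
sample = (
    struct.pack(">I", 1)
    + b"CAB-" + b"0" * 40 + b"\0"
    + struct.pack(">II", 60, 128)
    + b"\0" * 3
)
stream = io.BytesIO(sample)
entries = read_asset_header(stream)
```

With a 44-character name, the fields occupy 4 + 45 + 8 = 57 bytes, so aligning to 4 skips 3 bytes, which matches the padding[3] in the struct above under these assumptions.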

I assume that this logic could be hoisted up to AssetBundle, instead of assuming that compressed bundles always contain exactly one asset. The Asset constructor could then take something like a slice of the decompressed buffer, [offset:offset+size], and be given its name from the header.
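A rough sketch of the slicing step, assuming the offsets are relative to the start of the decompressed buffer. `split_assets` is a hypothetical helper, not part of UnityPack, and the entries are plain (name, offset, size) tuples standing in for whatever the header parser would produce.

```python
from typing import Dict, Iterable, Tuple


def split_assets(
    data: bytes, entries: Iterable[Tuple[str, int, int]]
) -> Dict[str, memoryview]:
    """Map each asset name to a zero-copy view of data[offset:offset+size]."""
    view = memoryview(data)
    assets: Dict[str, memoryview] = {}
    for name, offset, size in entries:
        end = offset + size
        # Reject entries that point outside the decompressed buffer.
        if offset < 0 or size < 0 or end > len(view):
            raise ValueError(
                f"asset {name!r} spans [{offset}:{end}] "
                f"but the buffer is only {len(view)} bytes"
            )
        assets[name] = view[offset:end]
    return assets


# Synthetic example: a 16-byte buffer with one 8-byte asset at offset 4.
data = bytes(range(16))
assets = split_assets(data, [("CAB-example", 4, 8)])
```

Each slice could then be handed to the Asset constructor along with its name, so the bundle code no longer has to assume a single asset per compressed bundle.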

I don't have good testing infrastructure, though, so I'm afraid I will cause a regression for files I don't have in front of my face. Suggestions welcome.