Blizzard / heroprotocol

Python library to decode Heroes of the Storm replays
MIT License

HeroMasteryTier Hero Encoding #77

Closed MGatner closed 4 years ago

MGatner commented 5 years ago

In the hero mastery array in initdata, m_hero is listed as a four-character code, but the values are coming through garbled (even with escaped non-printable characters, like \x13). I’m not sure how heroprotocol is attempting to encode the string, but Barrett solves this with the following: https://github.com/barrett777/Heroes.ReplayParser/blob/3e461251ccf36d0aaad71daba2b4223e61c455b6/Heroes.ReplayParser/MPQFiles/ReplayInitData.cs#L305

Output should be the four-character “attribute_id” hero identifier.

MGatner commented 5 years ago

This one is baffling me. Someone who knows the inner workings of Python might have an idea, but the issue seems to be with reading four 8-bit characters one at a time. For example, Ana's attribute_id is HANA, so I'd expect [ 72 65 78 65 ], but instead read_unaligned_bytes in decoders.py returns "H\x13P\x91" [ 72 19 80 145 ]: 01001000 00010011 01010000 10010001

If I change the code to grab all 32 bits at once I get the expected integer 1212239425, which is: 01001000 01000001 01001110 01000001

read_unaligned_bytes uses chr, which operates on a single byte, so it shouldn't be subject to endian conversion issues.

MGatner commented 5 years ago

Losing my mind here. Just to triple-check the issue I just posted, I returned the bits directly using "{0:b}".format(bits), first as the four 8-bit chunks: 1001000 10011 1010000 10010001

And next as one 32-bit chunk: 1001000010000010100111001000001

In other words - exactly as I was already seeing. There's definitely something happening at the bit reader or file format level that I'm not understanding 'cause this is straight up screwball.
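Because "{0:b}" drops leading zeros, the two outputs above are hard to line up by eye. A minimal sketch (Python 3, using only the numbers quoted in this thread; the variable names are illustrative) that zero-pads both readings so they can be compared bit for bit:

```python
# Values quoted in the comments above: the four 8-bit reads and the single 32-bit read.
chunks = [72, 19, 80, 145]
word = 1212239425

# Pad each 8-bit chunk to a full 8 binary digits.
padded_chunks = " ".join("{0:08b}".format(c) for c in chunks)

# Pad the 32-bit value to 32 digits, then group it into bytes.
padded_word = "{0:032b}".format(word)
grouped_word = " ".join(padded_word[i:i + 8] for i in range(0, 32, 8))

print(padded_chunks)  # 01001000 00010011 01010000 10010001
print(grouped_word)   # 01001000 01000001 01001110 01000001
```

Only the first byte agrees; the remaining three differ between the two reads.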

FWIW an easy workaround is to modify read_unaligned_bytes in decoders.py (line 83) to read 32 bits at once, then unpack that into its byte representation and read those as characters one at a time. I'm still at a loss as to how to convert the current output into anything usable.
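A rough sketch of that workaround in Python 3, not the actual heroprotocol code. It assumes only that the reader exposes read_bits(n) returning an integer, as the snippets in this thread suggest. StubReader and read_fourcc are hypothetical names used for illustration.

```python
import struct


def read_fourcc(reader):
    """Read 32 bits in one call and split them into four characters."""
    value = reader.read_bits(32)
    # ">I" packs an unsigned 32-bit integer, most significant byte first.
    return struct.pack(">I", value).decode("latin-1")


class StubReader:
    """Hypothetical stand-in for the real bit reader; returns a fixed value."""

    def __init__(self, value):
        self._value = value

    def read_bits(self, count):
        # A real reader would consume `count` bits from the replay buffer.
        return self._value


print(read_fourcc(StubReader(1212239425)))  # HANA
```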

yretenai commented 5 years ago

for curiosity's sake, what happens if you change read_unaligned_bytes to

def read_unaligned_bytes(self, bytes):
  bits = self.read_bits(8 * bytes)
  return ''.join([chr(bits[i:i+8]) for i in xrange(0, 8 * bytes, 8)])

MGatner commented 5 years ago

@healingbrew TypeError: 'int' object has no attribute '__getitem__'
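The TypeError is consistent with read_bits returning a plain integer, which cannot be sliced. A minimal sketch (Python 3) of the same idea using shifts and masks instead of slicing. The helper name chars_from_bits is hypothetical, and it takes the already-read integer as an argument rather than calling the real reader:

```python
def chars_from_bits(bits, nbytes):
    """Split an integer into nbytes characters, most significant byte first."""
    # For nbytes=4 the shifts are 24, 16, 8, 0.
    shifts = range(8 * (nbytes - 1), -1, -8)
    # Shift the wanted byte down to the low 8 bits, then mask off the rest.
    return "".join(chr((bits >> shift) & 0xFF) for shift in shifts)


print(chars_from_bits(1212239425, 4))  # HANA
```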

barrett777 commented 5 years ago

I think the issue is that after reading the first 8-bit chunk, the reader aligns itself to the next byte, skipping over some bits.

You can see what I experienced in C# here: https://repl.it/repls/CloudyGrimPascal

Agilhardt commented 4 years ago

Fixed in the next update.

MGatner commented 4 years ago

@Agilhardt It appears that 2.50.1.79515 did not affect heroprotocol.py - did this fix get bumped?

Agilhardt commented 4 years ago

The fix is in decoders.py#L134 in fa99bf2. To test, parse a replay with --initdata and check m_syncLobbyState.m_lobbyState.m_slots[0].m_heroMasteryTiers.

Please let me know if it's not working elsewhere.

edit: --initdata, not --details.
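For anyone scripting that check, here is a small sketch in Python 3. The sample dictionary is a hand-written, hypothetical stand-in for the parsed initdata: it uses only the key names quoted in this thread, with made-up values. The rule that a valid code is four ASCII letters or digits is also an assumption, not something the library guarantees.

```python
import string

# Hypothetical stand-in for parsed --initdata output; values are made up.
sample_initdata = {
    "m_syncLobbyState": {
        "m_lobbyState": {
            "m_slots": [
                {"m_heroMasteryTiers": [{"m_hero": "HANA"},
                                        {"m_hero": "H\x13P\x91"}]},
            ]
        }
    }
}

# Assumption: a well-formed hero code is four ASCII letters or digits.
ALLOWED = set(string.ascii_letters + string.digits)


def looks_like_hero_code(code):
    return len(code) == 4 and all(ch in ALLOWED for ch in code)


def suspicious_hero_codes(initdata):
    """Return (slot index, code) pairs whose m_hero value looks garbled."""
    slots = initdata["m_syncLobbyState"]["m_lobbyState"]["m_slots"]
    bad = []
    for index, slot in enumerate(slots):
        for entry in slot.get("m_heroMasteryTiers", []):
            if not looks_like_hero_code(entry["m_hero"]):
                bad.append((index, entry["m_hero"]))
    return bad


print(suspicious_hero_codes(sample_initdata))
```

An empty list from a replay parsed with the fixed decoder would suggest the codes are coming through cleanly.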