MGatner closed this issue 4 years ago
This one is baffling me. Someone who knows the inner workings of Python might have an idea, but the issue seems to be reading four 8-bit characters one at a time.
For example, if I come across Ana, her attribute_id is HANA, so I'd expect [ 72 65 78 65 ] - but instead read_unaligned_bytes in decoders.py returns "H\x13P\x91" [ 72 19 80 145 ]:
01001000 00010011 01010000 10010001
If I change the code to grab all 32 bits at once I get the expected integer 1212239425, which is:
01001000 01000001 01001110 01000001
read_unaligned_bytes is using chr which operates on a single byte so shouldn't be subject to endian conversion issues.
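As a quick sanity check on the numbers above, here is a minimal Python 3 sketch (the thread's own code is Python 2) that unpacks the 32-bit integer into its four bytes using only the standard library. The variable names are illustrative and not taken from heroprotocol.

```python
import struct

value = 1212239425  # the integer obtained by reading all 32 bits at once

# Pack as a big-endian unsigned 32-bit integer to recover the four bytes.
raw = struct.pack(">I", value)
codes = list(raw)
bit_groups = " ".join(format(b, "08b") for b in raw)

print(raw.decode("ascii"))  # HANA
print(codes)                # [72, 65, 78, 65]
print(bit_groups)           # 01001000 01000001 01001110 01000001
```

Using format(b, "08b") rather than "{0:b}" keeps the leading zeros, so each chunk lines up as a full 8-bit group.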
Losing my mind here. Just to triple reconfirm the issue I just posted, I returned the bits directly using "{0:b}".format(bits), first by the four 8-bit chunks: 1001000 10011 1010000 10010001
And next as one 32-bit chunk: 1001000010000010100111001000001
In other words - exactly as I was already seeing. There's definitely something happening at the bit reader or file format level that I'm not understanding 'cause this is straight up screwball.
FWIW an easy workaround is to modify read_unaligned_bytes in decoders.py (line 83) to read all 32 bits at once, then unpack that into its byte representation and read those bytes as characters one at a time. I'm still at a loss as to how to convert the current output into anything usable.
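The workaround can be sketched as follows, in Python 3. SketchBitReader is a hypothetical stand-in written for this example. Its read_bits is an assumption about how a big-endian bit-packed reader might combine the leftover bits of each byte; it is not copied from decoders.py. The five input bytes are made up so that, after two bits have been consumed, the sketch reproduces both readings reported above.

```python
class SketchBitReader:
    """Hypothetical stand-in for the bit reader; not heroprotocol's class."""

    def __init__(self, data):
        self._data = data
        self._used = 0      # number of bytes taken from data so far
        self._next = 0      # unread bits of the current byte
        self._nextbits = 0  # how many unread bits remain in _next

    def read_bits(self, bits):
        result = 0
        resultbits = 0
        while resultbits != bits:
            if self._nextbits == 0:
                self._next = self._data[self._used]
                self._used += 1
                self._nextbits = 8
            copybits = min(bits - resultbits, self._nextbits)
            copy = self._next & ((1 << copybits) - 1)
            # Assumed big-endian combine: earlier pieces go in higher positions.
            result |= copy << (bits - resultbits - copybits)
            self._next >>= copybits
            self._nextbits -= copybits
            resultbits += copybits
        return result

    def read_unaligned_bytes_per_byte(self, count):
        # Behaviour described in the thread: one 8-bit read per character.
        return bytes(self.read_bits(8) for _ in range(count))

    def read_unaligned_bytes_at_once(self, count):
        # Workaround: one wide read, then split into big-endian bytes.
        return self.read_bits(8 * count).to_bytes(count, "big")


DATA = bytes([0x48, 0x10, 0x53, 0x90, 0x01])  # made-up input for illustration

reader = SketchBitReader(DATA)
reader.read_bits(2)  # leave the reader two bits into the first byte
garbled = reader.read_unaligned_bytes_per_byte(4)

reader = SketchBitReader(DATA)
reader.read_bits(2)
fixed = reader.read_unaligned_bytes_at_once(4)

print(list(garbled))  # [72, 19, 80, 145]
print(fixed)          # b'HANA'
```

In this sketch no bits are skipped. The two reads consume the same 32 bits and differ only in where the leftover bits of each source byte end up, which is why only the wide read yields the expected characters. Whether the real reader behaves the same way would need to be checked against decoders.py.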
For curiosity's sake, what happens if you change read_unaligned_bytes to
    def read_unaligned_bytes(self, bytes):
        bits = self.read_bits(8 * bytes)
        return ''.join([chr(bits[i:i+8]) for i in xrange(0, 8 * bytes, 8)])
@healingbrew TypeError: 'int' object has no attribute '__getitem__'
I think the issue is that after reading the first 8-bit chunk, the reader aligns itself to the next byte, skipping over some bits
You can see what I experienced in C# here: https://repl.it/repls/CloudyGrimPascal
Fixed in the next update.
@Agilhardt It appears that 2.50.1.79515 did not affect heroprotocol.py - did this fix get bumped?
The fix is in decoders.py#L134 in fa99bf2. To test, parse a replay with --initdata and check m_syncLobbyState.m_lobbyState.m_slots[0].m_heroMasteryTiers.
Please let me know if it's not working elsewhere.
edit: --initdata, not --details.
In initdata, the m_hero field of the hero mastery array is listed as a four-character code, but the values are coming through garbled (even with escaped non-printable characters, like \x13). I’m not sure how heroprotocol is attempting to decode the string, but Barrett solves this with the following: https://github.com/barrett777/Heroes.ReplayParser/blob/3e461251ccf36d0aaad71daba2b4223e61c455b6/Heroes.ReplayParser/MPQFiles/ReplayInitData.cs#L305
Output should be the four-character “attribute_id” hero identifier.