happyleavesaoc / aoc-mgz

Age of Empires II recorded game parsing and summarization in Python 3.
MIT License

Populate `rate_snapshot` attribute of each player with POSTGAME op data #95

Closed Guimoute closed 1 year ago

Guimoute commented 1 year ago

These changes populate the rate_snapshot attribute of each player so that they are accessible easily using the summary object.

They were tested with the code snippet below on a game from the most recent patch (78757).

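import mgz.summary

# `path` is the location of the recorded game file to parse.
with open(path, "rb") as data:
    summary = mgz.summary.Summary(data)

# Print each player's name next to the rating taken from the POSTGAME data.
for player in summary.get_players():
    print(player["name"], player["rate_snapshot"])

happyleavesaoc commented 1 year ago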

Thanks!

Guimoute commented 1 year ago

Hello! I found a bug that I cannot explain: the ratings are sometimes swapped. CaptureAge has the same behavior.

Should we sort the players found in op_data by a certain criterion before assigning the ratings? By user_id? I also do not understand why the player numbers in the op_data and in the Summary are different. If it's simply 0-indexed vs 1-indexed, I can add a bugfix to my PR.


Tests: Both of these games have swapped ratings:

Swapped Elo 1

# op_data
{'leaderboards': [{'id': 3, 'players': [{'number': 1, 'rank': 2853, 'rating': 1544},
                                        {'number': 0, 'rank': 4378, 'rating': 1451}]}],
 'world_time': 2394348}

# player["name"], player["rate_snapshot"]
Guimoute 1544
MaJoR_Kd 1451

# summary.get_players()
[{'name': 'Guimoute',
  'number': 1,
  'color_id': 4,
  'user_id': 26xxxxx,
   ...},
 {'name': 'MaJoR_Kd',
  'number': 2,
  'color_id': 0,
  'user_id': 28xxxx,
   ...}
]

Swapped Elo 2

# op_data
{'leaderboards': [{'id': 3, 'players': [{'number': 1, 'rank': 4027, 'rating': 1471},
                                        {'number': 0, 'rank': 3822, 'rating': 1483}]}],
 'world_time': 1128855}

# player["name"], player["rate_snapshot"]
BigAl 1471
Guimoute 1483

# summary.get_players() 
[{'name': 'BigAl',
  'number': 1,
  'color_id': 3,
  'user_id': 24xxxx,
  ...},
 {'name': 'Guimoute',
  'number': 2,
  'color_id': 4,
  'user_id': 26xxxxx,
   ...}
]

happyleavesaoc commented 1 year ago

There are two ways to reference players: 1) by index 2) by number

Index is the actual order of players. Number is the player-selected color. These two can be different! For example, the player at index 0 can have number 8. I would recommend determining which scheme the postgame ratings use and then matching accordingly. You could also match on profile ID.

Guimoute commented 1 year ago

The number attribute of a Player is not the player-selected color.

Let me rephrase the problem:

Anyway, since yesterday I have found a simple fix: sorting the data by ascending number. It gives the correct result on replays that previously gave correct ratings as well as on those that gave swapped ratings.

elif op_type is fast.Operation.POSTGAME:
    players_data: list = op_data["leaderboards"][0]["players"]
    players_data.sort(key=lambda player: player["number"]) # <--- Fixes the sometimes swapped ratings.
    for player, player_data in zip(players.values(), players_data):
        player.rate_snapshot = player_data["rating"]
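
To see why pairing by position swaps the ratings while sorting first does not, here is a minimal standalone sketch using the figures from the "Swapped Elo 1" game above. Plain dicts and a list of names stand in for the mgz player objects, and `assign_by_position` is a hypothetical helper for illustration, not part of mgz.

```python
# Standalone sketch: plain dicts stand in for mgz player objects (an assumption).
# Data copied from the "Swapped Elo 1" example above.
postgame_players = [
    {"number": 1, "rank": 2853, "rating": 1544},
    {"number": 0, "rank": 4378, "rating": 1451},
]
summary_names = ["Guimoute", "MaJoR_Kd"]  # order returned by summary.get_players()


def assign_by_position(names, entries):
    """Hypothetical helper: pair the i-th name with the i-th postgame entry."""
    return {name: entry["rating"] for name, entry in zip(names, entries)}


# Pairing in file order reproduces the swapped ratings.
unsorted_result = assign_by_position(summary_names, postgame_players)

# Sorting a copy by "number" first lines the entries up with the player order.
sorted_entries = sorted(postgame_players, key=lambda entry: entry["number"])
sorted_result = assign_by_position(summary_names, sorted_entries)

print(unsorted_result)  # {'Guimoute': 1544, 'MaJoR_Kd': 1451}
print(sorted_result)    # {'Guimoute': 1451, 'MaJoR_Kd': 1544}
```

Note that `sorted` returns a new list, whereas the `list.sort` call in the snippet above reorders the list inside `op_data` in place.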

happyleavesaoc commented 1 year ago

Can you provide a correct replay and an incorrect replay?

Guimoute commented 1 year ago

Sure!

game1 correct.zip game2 incorrect.zip

happyleavesaoc commented 1 year ago

@Guimoute can you confirm:

game1 correct.zip:

game2 incorrect.zip:

I'm a bit confused because https://github.com/happyleavesaoc/aoc-mgz/pull/95#issuecomment-1498204721 says the MaJoR_Kd game is incorrect.

Guimoute commented 1 year ago

Hmm, I apparently made a mistake when I renamed and zipped the files... Sorry 11

Game 1 is incorrect and game 2 is correct. The games were played back to back, so my correct ratings are the ones with the smallest jump (1470 to 1451); the opponents are 1544 and 1394.

happyleavesaoc commented 1 year ago

@Guimoute thanks, makes more sense. This should do it:

by_number = {x["number"]: x["rating"] for x in op_data["leaderboards"][0]["players"]}
for player in players.values():
    player.rate_snapshot = by_number.get(player.number - 1)

It looks like the player's number minus 1 matches the number field of the corresponding postgame player entry. I think this is because the postgame structure doesn't count gaia (which is normally number/index 0).

edit: I forgot to note that your solution works, except if there's a rating missing for a player.
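
The difference between the two approaches shows up when an entry is missing. The sketch below is a standalone illustration with made-up three-player data and a hypothetical `Player` dataclass standing in for the mgz player object. It assumes, as suggested above, that postgame numbers equal the player numbers minus 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Player:
    """Hypothetical stand-in for the mgz player object."""
    name: str
    number: int
    rate_snapshot: Optional[int] = None


# Made-up data: the postgame block has no entry for player B (postgame number 1).
postgame_players = [
    {"number": 2, "rating": 1300},
    {"number": 0, "rating": 1100},
]
players = [Player("A", 1), Player("B", 2), Player("C", 3)]

# Sort-and-zip: entries shift up to fill the gap, so B silently gets C's rating.
by_position = {
    p.name: e["rating"]
    for p, e in zip(players, sorted(postgame_players, key=lambda e: e["number"]))
}

# Lookup by number: the missing player gets None instead of someone else's rating.
by_number = {e["number"]: e["rating"] for e in postgame_players}
for p in players:
    p.rate_snapshot = by_number.get(p.number - 1)

print(by_position)                                 # {'A': 1100, 'B': 1300}
print({p.name: p.rate_snapshot for p in players})  # {'A': 1100, 'B': None, 'C': 1300}
```

With the dictionary lookup, a missing rating stays visible as `None` rather than shifting every later player's rating by one position.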

happyleavesaoc commented 1 year ago

I'll push a new release with this change unless you still believe it is incorrect.

Guimoute commented 1 year ago

I tested on a few more games (recent ones) and it seems to work, so we are good to go! Thank you for the support.