HearthSim / python-hearthstone

Hearthstone Python library (CardDefs, DBF, enums, log parser)
MIT License
246 stars 62 forks source link

Decouple LogParser and PacketTree #8

Closed jleclanche closed 8 years ago

jleclanche commented 8 years ago

In trying to make LogParser more usable for the HSReplay parser, I'm discovering that there are actually three distinct parts to our parser:

Currently, the Packet tree and the Power.log parser are tightly coupled. This is why it intuitively feels easy to make HSReplay output a packet tree, but it really isn't.

The reason they are coupled is because in order to set the entity attributes on the packets, those entities need to exist as actual entity objects, so we continuously update the entity tree as we read the log. Part of the reasoning behind that, IIRC, was to have semi-reliable Player entities that would have their name on all packets by the end of the game.

To decouple all this, here is what I propose:

  1. Change all entity attributes on the various packets to only be integers, except for players. This will be a tough change, which will require updating the XML dumper as well.
  2. Change Player entities to a lazy int so that it can be treated as an int, but it looks for its value in a known place.
  3. Add a mechanism to export a packet tree (or part of a packet tree?) to an entity tree.
  4. Remove the immediate updates on the game tree, replace it with the export
  5. Rework LogWatcher to base itself on the packet tree's export/update mechanism

Then, as long as we can export HSReplayDocument to a packet tree, we can use LogWatcher alongside it.

jleclanche commented 8 years ago

Alright, progress, finally.

As of 8bf464c5d46cac56413b1326331f001915ae941c:

Here's what we lost:

jleclanche commented 8 years ago

Exporting PacketTree to an entity tree is done. I'm starting to test the parser with existing logs now.

Something really annoying I just realized: Mulligan-based player name registration no longer works, because it relies on CONTROLLER. Urgh. We could track a simplified entity tree (one that doesn't update, just registers all entities and their controllers) but that's more complexity just for the sake of player names.

I'm thinking an easier solution is to do player name registration completely out of band. With the LazyPlayer system, we no longer need to do name registration on the fly. We used to, because we were buffering packets, which meant the parser would accumulate a bunch of packets until the player name was ready, and then flush it all out at once. Not ideal, especially for streaming

Right, so that's the other thing: As-is, we lost streaming capabilities. I think we can restore them pretty easily, there's just more work involved, as users need to maintain both the packet tree and the entity tree themselves. I'm happy not to have a helper for it (since we don't use this at all), as long as it's capable of doing it.

jleclanche commented 8 years ago

Thinking about this more, player names definitely need to be populated synchronously. Otherwise there's too much chicken-and-egg: You do an export of the game state, and that export needs player names. But if you need player names you need game state. ...

So, we'll have to track just enough for this. We'll keep an entity_id/controller map and wipe it once the player names are available.

jleclanche commented 8 years ago

Awesome!! PlayerManager is finished and working like a charm.

I reimplemented the ENTITY_ID hook using the controller lookup map. Super reliable. I also replaced the CURRENT_PLAYER hook with a LAST_CARD_PLAYED. It's less hacky, easier to follow and reuses the same controller map. A ton more reliable as CURRENT_PLAYER would not always update in the correct order. I also implemented an "aggressive" name registration hook, which, if there is a name already registered, will automatically guess which player a name belongs to without ever running any of the other hooks! It can do that because there can only be two players in a game. I'm not sure on its reliability since AI names can change throughout a game, so I've also added an ai_player attribute to the PlayerManager which we can look up.

With this, the new parser is complete, I think. I ran it over the entire hsreplay-test-data repo and I only got positive diffs (fixed bugs)!

The two diffs that jumped out:

I'll try to run both parsers on failed uploads from HSReplay.net and see how they both fare overall. Hoping to push it live tomorrow.

jleclanche commented 8 years ago

And with those last few commits, I am done.

Documentation of the new system:

I'll be testing it and pushing the system live tonight when it's a bit more quiet. Super proud of it.