PennyDreadfulMTG / Penny-Dreadful-Tools

A suite of tools for the Penny Dreadful MTGO community
https://pennydreadfulmagic.com
MIT License
41 stars 28 forks source link

Analysis of game logs #790

Open silasary opened 7 years ago

silasary commented 7 years ago

PDBot records game logs. Do we want to do something with them?

bakert commented 6 years ago

I messaged /u/mtgratingtester on reddit who used to do this for mtg goldfish before wotc asked them to stop. They obviously won’t care about pd. Would be good to start with something that exists. I’ll also try Josh Silvestri who published similar stuff on cfb for a while. We know each other by sight (but no more) so that might be an “in”.

bakert commented 6 years ago

I messaged Josh. No reply yet from either.

@silasary – Can you attach some examples?

bakert commented 6 years ago

Things that can be analyzed here:

I guess basically we would store {match_id,card,deck_id} for every card ever played. Then you can run whatever queries you want.

bakert commented 6 years ago

I remembered the bot could give me these. Some examples:

192353549.txt 192298930.txt 190352623.txt

bakert commented 6 years ago

Basic plan is probably:

bakert commented 6 years ago

A more complex problem is to tie existing logs to matches but that shouldn't be too hard given that we know the result and both opponents and which match came before which in each place. We may have to reject some due to ambiguity.

bakert commented 6 years ago

More thinking.

I think we store an order number (what number action in the game this was) and (chat|action) enum so we can elide chat unless you were in the game or are an admin. (Or should all chat be considered public because the game was public? I think it's weird if people can view it retroactively when the only "person" there besides your opponent was pdbot.)

bakert commented 6 years ago

Features:

silasary commented 6 years ago

Everything modo bot needs to do has been done.

The analysis should be happening based on logsite's side.

bakert commented 5 years ago

pdlogs.match.id == decksite.match.mtgo_id for when we want to tie it up to people. Also mtgo_username I guess.

bakert commented 5 years ago

I linked the matches off /person/{id}/matches/. Step one.

bakert commented 5 years ago

The most basic unit we're interested in here is "{mtgo_username} casts [{cardname}]". I'm just gonna process those to start with.

bakert commented 5 years ago

Improvement: record the turn against each play.

bakert commented 5 years ago

There's a tiny amount of data where we have the match id but not the log. Not important.

mysql> SELECT COUNT(*) FROM `match` WHERE mtgo_id AND mtgo_id NOT IN (SELECT match_id FROM logsite.game);
+----------+
| COUNT(*) |
+----------+
|       91 |
+----------+
1 row in set (0.10 sec)

mysql> SELECT COUNT(*) FROM `match` WHERE mtgo_id AND mtgo_id IN (SELECT match_id FROM logsite.game);
+----------+
| COUNT(*) |
+----------+
|    49015 |
+----------+
1 row in set (0.20 sec)

We have none of the reverse.

mysql> SELECT COUNT(*) FROM logsite.game WHERE match_id NOT IN (SELECT mtgo_id FROM `match`);
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.24 sec)
bakert commented 5 years ago

This is a pretty large amounts of data.

Naïve consolidation queries on small amounts of it take minutes.

[logsite.game]

game_id,
log

=>

[logsite._game_card_person]

game_id,
name,
mtgo_username

=>

[decksite._played_card_person_stats]

name,
person_id,
season_id,
wins,
draws,
losses

Building _played_card_person_stats is super expensive. Indexes might help.

Will have to think of something smarter or do everything piecemeal.

bakert commented 5 years ago
mysql>  SELECT
    ->             *
    ->         FROM
    ->             logsite._game_card_person AS gcp
    ->         INNER JOIN
    ->             person AS p ON gcp.mtgo_username = p.mtgo_username
    ->         INNER JOIN
    ->             logsite.game AS g ON g.id = gcp.game_id
    ->         INNER JOIN
    ->             `match` AS m ON m.mtgo_id = g.match_id
    -> ;
Empty set (4 min 38.79 sec)
jgabrielygalan commented 4 years ago

Possible other interesting stats: average number of turns per game. If we can link it to the decks, which we can now, we can crunch those averages by archetype or any other combination similar to the matchups page.

jgabrielygalan commented 4 years ago

I think the first thing to do would be to structure the data, so when PDBot sends the lines to the DB have a processor that can extract structured info like turn number, card played or action taken, so they can be aggregated later by those keys. Answering questions like: what's the average number of times someone has activated Rofellos? Or what's my win record when I cast Chain Lightning twice in a game?

All in all I think we could build a pretty complex an interesting system for analysis.