Open silasary opened 7 years ago
I messaged /u/mtgratingtester on reddit who used to do this for mtg goldfish before wotc asked them to stop. They obviously won’t care about pd. Would be good to start with something that exists. I’ll also try Josh Silvestri who published similar stuff on cfb for a while. We know each other by sight (but no more) so that might be an “in”.
I messaged Josh. No reply yet from either.
@silasary – Can you attach some examples?
Things that can be analyzed here:
I guess basically we would store {match_id,card,deck_id} for every card ever played. Then you can run whatever queries you want.
I remembered the bot could give me these. Some examples:
Basic plan is probably:
match
A more complex problem is to tie existing logs to matches but that shouldn't be too hard given that we know the result and both opponents and which match came before which in each place. We may have to reject some due to ambiguity.
More thinking.
I think we store an order number (what number action in the game this was) and (chat|action) enum so we can elide chat unless you were in the game or are an admin. (Or should all chat be considered public because the game was public? I think it's weird if people can view it retroactively when the only "person" there besides your opponent was pdbot.)
Features:
Everything modo bot needs to do has been done.
The analysis should be happening based on logsite's side.
pdlogs.match
.id == decksite.match
.mtgo_id for when we want to tie it up to people. Also mtgo_username I guess.
I linked the matches off /person/{id}/matches/. Step one.
The most basic unit we're interested in here is "{mtgo_username} casts [{cardname}]". I'm just gonna process those to start with.
Improvement: record the turn against each play.
There's a tiny amount of data where we have the match id but not the log. Not important.
mysql> SELECT COUNT(*) FROM `match` WHERE mtgo_id AND mtgo_id NOT IN (SELECT match_id FROM logsite.game);
+----------+
| COUNT(*) |
+----------+
| 91 |
+----------+
1 row in set (0.10 sec)
mysql> SELECT COUNT(*) FROM `match` WHERE mtgo_id AND mtgo_id IN (SELECT match_id FROM logsite.game);
+----------+
| COUNT(*) |
+----------+
| 49015 |
+----------+
1 row in set (0.20 sec)
We have none of the reverse.
mysql> SELECT COUNT(*) FROM logsite.game WHERE match_id NOT IN (SELECT mtgo_id FROM `match`);
+----------+
| COUNT(*) |
+----------+
| 0 |
+----------+
1 row in set (0.24 sec)
This is a pretty large amounts of data.
Naïve consolidation queries on small amounts of it take minutes.
[logsite.game]
game_id,
log
=>
[logsite._game_card_person]
game_id,
name,
mtgo_username
=>
[decksite._played_card_person_stats]
name,
person_id,
season_id,
wins,
draws,
losses
Building _played_card_person_stats
is super expensive. Indexes might help.
Will have to think of something smarter or do everything piecemeal.
mysql> SELECT
-> *
-> FROM
-> logsite._game_card_person AS gcp
-> INNER JOIN
-> person AS p ON gcp.mtgo_username = p.mtgo_username
-> INNER JOIN
-> logsite.game AS g ON g.id = gcp.game_id
-> INNER JOIN
-> `match` AS m ON m.mtgo_id = g.match_id
-> ;
Empty set (4 min 38.79 sec)
Possible other interesting stats: average number of turns per game. If we can link it to the decks, which we can now, we can crunch those averages by archetype or any other combination similar to the matchups page.
I think the first thing to do would be to structure the data, so when PDBot sends the lines to the DB have a processor that can extract structured info like turn number, card played or action taken, so they can be aggregated later by those keys. Answering questions like: what's the average number of times someone has activated Rofellos? Or what's my win record when I cast Chain Lightning twice in a game?
All in all I think we could build a pretty complex an interesting system for analysis.
PDBot records game logs. Do we want to do something with them?