mikemccllstr / dominionstats

The code behind councilroom.com.
http://councilroom.com
11 stars 3 forks source link

Precalculate more stats for "popular buys per player" page to improve its performance #25

Open mikemccllstr opened 11 years ago

mikemccllstr commented 11 years ago

From Rob's post in councilroom-dev:

I suspect because a few of the pages are coded very poorly and traverse a whole users set of games to compute stats (popular buys per player, the player page, and the player vs other player record page all do this). When a a similarly poorly coded search engine comes around (these pages are blocked by robots.txt, but some search engines don't honor it), it just keeps putting too many requests on those uber slow pages and councilroom eventually runs out of memory and dies. The decision to do a scan of all the players games came when players had a few hundred game max, but now that there are crazy people with 10k games, and so the design is horribly broken. If someone pre-computed those pages, I am pretty sure a 486 could serve councilroom fine.

Not sure what is required to be precalculated in advance, but it is distinctly slow for users with large game counts.

mikemccllstr commented 11 years ago

As reported by Rabid on http://forum.dominionstrategy.com/index.php?topic=6106.msg165102#msg165102, looks like the breaking point with the current code, hosting, and data volume lies between 4500 and 5000 games:

Working

-Stef- (2738 games) http://councilroom.com/popular_buys?player=-Stef- TheSadPanda (4547 games) http://councilroom.com/popular_buys?player=TheSadPanda

Sometimes, Maybe, Works

papaHav (5086 games) http://councilroom.com/popular_buys?player=papaHav

Not Working

Rabid (8325 games) http://councilroom.com/popular_buys?player=Rabid chwhite (9968 games) http://councilroom.com/popular_buys?player=chwhite

rrenaud commented 11 years ago

Implementation advice for this.

The basic idea is to give every[1] player their own count_buys.DeckBuyStats entry in the buys table. The pre-computing logic in count_buys gets a bit more complicated, because instead of maintaining a single global count_buys.DeckBuyStats, it also has to manage a dictionary keyed by player of DeckBuyStats. If you are going to do this incrementally, you might want to use the same updating strategy as in analyze2.EventAccumulator. Compute starting with an empty dict, and before writing, do a read and merge of the existing data per player, then write out the computed data. If committing day by day, this let's you skip reading players data who didn't play on a given day.

[1]If db space is a premium, you might want to get a little fancier and use a hybrid on the fly for infrequent players/precomputed for frequent players strategy. Say only give players who have some threshold, say 50 games, their own entry in the buys column, but then you have to manage tracking #games per player and have some fancy logic to decide when you should start storing the data. I am not sure this is worth it.