get_abilities, the first query I've built for the moveset report, is really slow when operating on a full dataset. And when you think about it, that makes sense, because it's still having to scan through every Pokemon used in every battle in the metagame-month, since the filter is on a group-concat, which I suspect is a rather expensive operation (especially compared to a straight table lookup). OTOH, subsequent runs in the same session are quite a bit faster, thanks, I'm guessing, to the subquery being cached.
So the question is whether I want to let this ride, write a set of queries that work, and not worry about performance until I need to (there's merit to that when I don't know what my performance targets are or whether I'm already meeting them) or do I redesign now so I potentially don't have to rewrite everything further down the line?
Basically, I can think of two alternative strategies:
Create a view / temp table / permanent table of weighted counts per SID for each metagame and weighting, add an indexed column for forme-concatenation, and then run my moveset queries against that.
Advantages:
Especially given the forme-concat column, I expect this to execute much faster
This table is what I envisioned creating for the public "deep query" API
Disadvantages:
Startup cost
Workflow needs a startup step
Maintenance / organization burden for the derived tables
Limited to preset weights
Including startup cost, it's possible that this won't be significantly faster than what I have now, depending on how caching works
Generate moveset reports all at once. That is, get abilities for all Pokemon in the tier at once by just doing one master group-by-aggregate for each type of report
Advantages:
Should be the most performant option, as it cuts down massively on the number of queries
Disadvantages:
Completely changes the workflow I envisioned
I mean, I think through the process of writing this up, I've come up with my answer, but I am going to sit on it for a minute.
get_abilities
, the first query I've built for the moveset report, is really slow when operating on a full dataset. And when you think about it, that makes sense, because it's still having to scan through every Pokemon used in every battle in the metagame-month, since the filter is on a group-concat, which I suspect is a rather expensive operation (especially compared to a straight table lookup). OTOH, subsequent runs in the same session are quite a bit faster, thanks, I'm guessing, to the subquery being cached.So the question is whether I want to let this ride, write a set of queries that work, and not worry about performance until I need to (there's merit to that when I don't know what my performance targets are or whether I'm already meeting them) or do I redesign now so I potentially don't have to rewrite everything further down the line?
Basically, I can think of two alternative strategies:
Create a view / temp table / permanent table of weighted counts per SID for each metagame and weighting, add an indexed column for forme-concatenation, and then run my moveset queries against that.
Advantages:
Disadvantages:
Generate moveset reports all at once. That is, get abilities for all Pokemon in the tier at once by just doing one master group-by-aggregate for each type of report
Advantages:
Disadvantages:
I mean, I think through the process of writing this up, I've come up with my answer, but I am going to sit on it for a minute.