RAPM phases 2+

Alex-At-Home commented 4 years ago

[x] Add use of offensive efficiency (def?) as a prior #130
- [x] Tidy up prior usage in edge cases (error/priorSum limit .. pick a max eg 50%) (https://github.com/Alex-At-Home/cbb-on-off-analyzer/commit/efeda0c877ee2af2f6373daa98382d7db4e846a6)
- [ ] (and add player averages as a prior for the other stats?)
[ ] Diagnostic mode
- [ ] Individual
- [x] Overall - #85 and #88
[ ] Fix season averages (+add for earlier years)
- [ ] In general have a look at the implementation/methodology for "other stats", not convinced
[ ] (Maybe experiment with alternative unbiased ridge regression techniques)
[x] Idea: Hybrid RAPM where you more heavily regress stats with good priors and calculate the points above average
[x] Luck adjustments #113
[x] Add RAPM to individual page #147
- Some "fast follow" tasks:
  - [x] Unify RAPM processing logic (team report/build leaderboards/individual) - #174
  - [ ] Add unit tests for RAPM mode
  - [ ] Maybe add link to diagnostics
  - [ ] Show original/overridden status when manual edits used
  - [ ] Speed up RAPM calcs (eg only need to calc over adj_ppp)
[x] Tweak RAPM to fit KenPom better: #171
[x] There's something weird going on with RAPM and luck, eg check out 20/21 Scott RAPM with luck enabled, it claims he has some crazy high non-luck RAPM (on individual page, haven't tried report page): #174
[x] For now I think pull use of luck out of RAPM lineup calcs (just leave them in the priors), I think the effect of not knowing the expected 3P% of the lineup based on player stats and shot distribution is just too high (related to #162): #174
- (would in theory also address the bullet above it, but would be good to understand the bug causing that first so it doesn't come back when I start adding it again)

Alex-At-Home commented 4 years ago

So currently the effect of my unbiasing is to dole out the adjusted margin in approx % of minutes played, which is not ideal

Alternatives:

Two parameter estimation: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4074997/
Less unbiased estimation: https://www.researchgate.net/publication/264911183_An_almost_unbiased_ridge_estimator
Estimate prior and use to unbias: http://www.utgjiu.ro/math/sma/v04/p08.pdf
- (the J prior should apparently be estimated as sum(i=1..p)(OLS[i]/p) * Id(1, p), where OLS[i] is the OLS-based estimate for player i)
Come up with alternative priors (eg everyone gets the same %, dole them out according to off/def efficiency, dole them out according to rep on-off?)

(Useful page with some info about error estimation: https://stats.stackexchange.com/questions/216335/standard-error-for-a-parameter-in-ordinary-least-squares)
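
A rough sketch of the "estimate prior and use to unbias" idea, for reference. It assumes the textbook form of ridge regression shrunk towards a prior vector, `beta = (X'X + lambda*I)^-1 (X'y + lambda*prior)`, which may differ in detail from the estimator in the linked paper. The J prior is read as "every player gets the mean of the OLS estimates". The lineup matrix, the margins, and all function names are made up for illustration and are not taken from the repo.

```python
def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def ridge_with_prior(x, y, lam, prior):
    """Minimise |y - x*beta|^2 + lam * |beta - prior|^2."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) + lam * prior[i]
           for i in range(p)]
    return solve(xtx, xty)


def mean_ols_prior(x, y):
    """J prior as read from the note above: mean OLS estimate for everyone."""
    p = len(x[0])
    ols_est = ridge_with_prior(x, y, 0.0, [0.0] * p)  # lam = 0 is plain OLS
    mean = sum(ols_est) / p
    return [mean] * p


# Toy data (hypothetical): 3 players, each row marks who is on court.
lineups = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
margins = [3.0, 1.0, 0.0, 3.0]

ols = ridge_with_prior(lineups, margins, 0.0, [0.0, 0.0, 0.0])
prior = mean_ols_prior(lineups, margins)
regressed = ridge_with_prior(lineups, margins, 2.0, prior)  # lam is arbitrary
shrunk = ridge_with_prior(lineups, margins, 1e9, prior)     # ~= prior
print(ols, prior, regressed)
```

With `lam = 0` this reduces to OLS, and as `lam` grows the estimates collapse onto the prior, so the prior decides where the regularization bias ends up.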

Alex-At-Home commented 4 years ago

Did some experiments here: https://github.com/Alex-At-Home/cbb-on-off-analyzer/tree/rapm_experiments

Alex-At-Home commented 4 years ago

UPDATED: fixed by https://github.com/Alex-At-Home/cbb-on-off-analyzer/commit/d6528a6ec2c0288ff973230156f8086e51cece0d

As I feared but stupidly didn't follow up on hard enough, you can run into trouble if the delta over average is near 0, eg here's the Xavier offense:

[Log] ********* [off] RAPM WITH LAMBDA 2.240 / 4
[Log] ["Marshall, Naji", "Carter, Jason", "Scruggs, Paul", "Jones, Tyrique", "Goodin, Quentin", "Freemantle, Zach", "Tandy, KyKy", "Moore, Bryce", "Bishop, Dahmir", "James, Dontarius"] (10)
[Log] rapm[PRE]: 0.761,-0.218,0.509,1.277,-0.048,-0.702,0.462,-0.848,-0.371,0.282
[Log] rapm[POST]: -4.968,4.822,-1.731,-9.310,7.790,-3.268,2.352,10.879,19.736,11.289
[Log] combinedRapm[PRE] = [1.2] vs actualEff = [1.7] ... Err[PRE] = [0.5] Err[POST]=[0.0]

RAPM comes up with a super sane estimate but to handle that 0.5 error, it adds lots of big +ve and -ve numbers together :(

RAPM Priors = [{"includeStrong":{},"playersStrong":[{},{},{},{},{},{},{},{},{},{}],"playersWeak":[{"off_adj_ppp":2.607,"def_adj_ppp":-2.046},{"off_adj_ppp":-2.294,"def_adj_ppp":-1.841},{"off_adj_ppp":1.02,"def_adj_ppp":-1.807},{"off_adj_ppp":4.819,"def_adj_ppp":-2.984},{"off_adj_ppp":-3.567,"def_adj_ppp":-1.312},{"off_adj_ppp":1.168,"def_adj_ppp":-1.401},{"off_adj_ppp":-0.86,"def_adj_ppp":-0.127},{"off_adj_ppp":-5.337,"def_adj_ppp":-1.411},{"off_adj_ppp":-9.151,"def_adj_ppp":-1.434},{"off_adj_ppp":-5.01,"def_adj_ppp":-0.012}]}]

In fact you'll see Goodin goes from -0.048 before to +7.790 with his prior of -3.567
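
A quick check of the numbers above, as a sketch. The three lists are copied from the log and the weak `off_adj_ppp` priors. The reading that POST = PRE + scale * prior is inferred from the logged values, not taken from the code. The clamp at the end is a hypothetical interpretation of the "error/priorSum limit .. pick a max eg 50%" task, and `apply_prior_correction` is an illustrative name rather than a function in the repo.

```python
# Values copied from the Xavier offense log above (same player order).
pre = [0.761, -0.218, 0.509, 1.277, -0.048, -0.702, 0.462, -0.848, -0.371, 0.282]
post = [-4.968, 4.822, -1.731, -9.310, 7.790, -3.268, 2.352, 10.879, 19.736, 11.289]
prior = [2.607, -2.294, 1.02, 4.819, -3.567, 1.168, -0.86, -5.337, -9.151, -5.01]

# Implied per-player scale factor: how many "priors" were added to each player.
scales = [(b - a) / p for a, b, p in zip(pre, post, prior)]
mean_scale = sum(scales) / len(scales)
print([round(s, 3) for s in scales])


def apply_prior_correction(pre, prior, scale, max_abs_scale=0.5):
    """Add scale * prior to each estimate, limiting |scale| (assumed cap)."""
    clamped = max(-max_abs_scale, min(max_abs_scale, scale))
    return [a + clamped * p for a, p in zip(pre, prior)]


clamped_post = apply_prior_correction(pre, prior, mean_scale)
print([round(v, 3) for v in clamped_post])
```

Every player comes out with the same implied scale, roughly -2.2, so each prior is being multiplied by about -2.2 to absorb the 0.5 error. Capping the scale keeps the adjusted values bounded, at the cost of leaving some of the error uncorrected.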

Alex-At-Home commented 3 years ago

During testing of #171, seem to have found a bug

Compare Prim Gaige's leaderboard vals vs report vals:

http://localhost:3000/TeamReport?filter=GaPrim&gender=Men&incRapm=true&maxRank=400&minRank=0&rapmDiagMode=base&showOnOff=false&team=Missouri%20St.&teamLuck=true&year=2020%2F21&
http://localhost:3000/PlayerLeaderboard?filter=Gaige&gender=Men&tier=High&year=2020%2F21&
(note that on/off page seems consistent with leaderboard?!)

His RAPM in the leaderboard has been splatted

Ah so when I run in diag mode, it starts at an earlier lambda and then picks the 2nd iteration - ah looks like it is now always picking the 2nd iteration (actually I noticed that in the unit test but assumed that was legit :( )

combinedRapm[PRE] = [6.9] vs actualEff = [6.1] ... Err[PRE] = [0.8] Err[POST] = [2.2]

Alex-At-Home / cbb-on-off-analyzer

RAPM phases 2+ #66