Well it's easy to find a possession discrepancy, eg take the first game Delaware (https://kenpom.com/box.php?g=58): 73-67 in 72 possessions
My aggregated lineup stats instead have 73-67 in 87 possessions, so there must be some situation I'm consistently getting wrong
Find that and then go through some more games I guess :(
Notes
{
"opponent": "18:04:00,2-5,Ryan Johnson, steal",
"opponent_possession": 2
},
{
"team": "18:04:00,2-5,Bruno Fernando, turnover badpass",
"team_possession": 3
},
Steal gets tagged onto the wrong possession here, which is currently harmless
Similarly:
{
"opponent": "14:23:00,12-14,Ryan Johnson, foul personal shooting;2freethrow",
"opponent_possession": 9
},
{
"team": "14:23:00,12-14,Bruno Fernando, foulon",
"team_possession": 10
},
Ah OK I see one problem ... if a lineup change occurs before the possession is complete it will get double counted
Example... end of 3rd lineup:
{
"team": "11:22:00,16-17,Bruno Fernando, foulon",
"team_possession": 1
},
{
"opponent": "11:22:00,16-17,Matt Veretto, foul personal",
"team_possession": 1
}
start of next lineup:
{
"team": "11:19:00,16-17,Bruno Fernando, rebound offensive",
"team_possession": 1
},
{
"team": "11:18:00,16-19,Bruno Fernando, 2pt dunk 2ndchance;pointsinthepaint made",
"team_possession": 1
},
OK so let's fix that and see where we end up...
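A minimal sketch of why the lineup boundary double counts, under a simplified model that is an assumption for illustration (the `Side` type and the per-lineup `Seq[Side]` representation are hypothetical, not the project's real types). Each event is reduced to the side that has the ball. Counting runs per lineup and summing counts a possession twice when it straddles a lineup change; counting runs over the concatenated stream does not.

```scala
object LineupPossessions {
  // Hypothetical model: which side has the ball for each event.
  sealed trait Side
  case object Team extends Side
  case object Opponent extends Side

  /** Number of runs of consecutive same-side events, i.e. possessions seen in one stream. */
  def countRuns(sides: Seq[Side]): Int =
    if (sides.isEmpty) 0
    else 1 + sides.zip(sides.tail).count { case (a, b) => a != b }

  /** Naive total: per-lineup counts summed (a possession spanning two lineups counts twice). */
  def naiveTotal(lineups: Seq[Seq[Side]]): Int = lineups.map(countRuns).sum

  /** Corrected total: runs counted over the whole game, ignoring lineup boundaries. */
  def correctedTotal(lineups: Seq[Seq[Side]]): Int = countRuns(lineups.flatten)
}
```

In the example above, one lineup ends with the team on the ball (the `foulon`) and the next starts with the same team possession (offensive rebound, dunk), so the naive total is one higher than the corrected one. A per-lineup breakdown would still have to assign that shared possession to exactly one of the two lineups.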
Here's another one:
RawGameEvent(Some("08:48:00,20-23,Serrel Smith Jr., rebound defensive"), None, Some(5), None),
RawGameEvent(None, Some("08:44:00,20-23,Jacob Cushing, steal"), None, Some(5)),
RawGameEvent(Some("08:44:00,20-23,Serrel Smith Jr., turnover lostball"), None, Some(6), None),
A steal, like a block, needs to be ignored; we'll flip possession on the actual offensive action (here the turnover)...
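A rough sketch of that idea, with assumptions flagged: the `Event` type is hypothetical, the raw strings are assumed to follow the "time,score,player, action" layout seen above, and "block" as an action prefix is a guess at how blocks are labelled. Defensive-only events are dropped before looking for changes of side.

```scala
object PossessionFlip {
  // Hypothetical simplified event: which side logged it, plus the raw play-by-play string.
  final case class Event(isTeam: Boolean, raw: String)

  /** Action part of a "time,score,player, action" string, e.g. "steal". */
  def action(raw: String): String = raw.split(",", 4).last.trim

  /** Defensive actions that should not move the ball to the other side by themselves.
    * The "block" prefix is an assumption about the feed's wording. */
  def isDefensiveOnly(raw: String): Boolean = {
    val a = action(raw)
    a.startsWith("steal") || a.startsWith("block")
  }

  /** Changes of side, judged only on the events that remain after filtering. */
  def possessionChanges(events: Seq[Event]): Int = {
    val kept = events.filterNot(e => isDefensiveOnly(e.raw))
    kept.zip(kept.drop(1)).count { case (a, b) => a.isTeam != b.isTeam }
  }
}
```

With the three events above (defensive rebound, opponent steal, turnover), the unfiltered stream would show two changes of side, while the filtered one shows none until the turnover itself is treated as ending the possession.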
Timeout!
RawGameEvent(Some("04:04:00,26-33,Aaron Wiggins, assist"), None, Some(2), None),
RawGameEvent(None, Some("04:04:00,26-33,Team, timeout short"), None, Some(3)),
RawGameEvent(Some("04:04:00,26-33,Anthony Cowan, 2pt layup fromturnover;pointsinthepaint;fastbreak made"), None, Some(3), None),
duh offensive foul
RawGameEvent(None, Some("02:33:00,27-38,Eric Carter, foulon"), None, Some(1)),
RawGameEvent(None, Some("02:28:00,27-38,Ryan Johnson, foul offensive"), None, Some(1)),
RawGameEvent(Some("02:28:00,27-38,Jalen Smith, foulon"), None, Some(1), None),
RawGameEvent(None, Some("02:28:00,27-38,Ryan Johnson, turnover offensive"), None, Some(2)),
RawGameEvent(Some("02:15:00,27-39,Anthony Cowan, freethrow 1of2 fromturnover made"), None, Some(2), None),
Actually those foulon events are the real ones to ignore, I think?
I have no idea what this means:
//prev event finishes with:
//RawGameEvent(Some("07:49:00,51-61,Darryl Morsell, freethrow 2of2 fastbreak;fromturnover made"), None, Some(5), None)
RawGameEvent(None, Some("07:49:00,51-60,Jacob Cushing, foul personal shooting;2freethrow"), None, None),
RawGameEvent(Some("07:45:00,51-61,Team, rebound offensivedeadball"), None, Some(1), None),
RawGameEvent(None, Some("07:43:00,51-61,Kevin Anderson, 3pt jumpshot missed"), None, Some(1)),
so that offensivedeadball pulls the entire event into the next lineup event (incorrectly, but that's a separate problem covered by another issue)
So actually what that means is that the problem must have occurred in the other event
(but this will be a problem in general when it does occur?)
Oh I understand what happened at least ... Morsell takes his second free throw and then gets subbed out. So this should be fine in practice
Analysis of 20 teams following first wave of fixes:
(me vs KP)
Delaware - correct
Navy - 70 vs 71 (I think that's just team vs opp possession)
NCAT - 71 vs 70
Hofstra - missing!
MSM - 73 vs 74
Virginia - missing!
* PSU - 67 vs 64
@Purdue (missing records)
Loyola - 63 vs 61
Loyola MD - missing!
Seton Hall - missing!
* Radford - 68 vs 65
Neb - 66 vs 65
@ Rut - 74 vs 72
** Minn - 70 vs 66
Indiana - 66 vs 65
*** Wis - 66 vs 61
@ Ohio St - correct
Mich St - 63 vs 62
v Illinois - correct
NW - correct
@Wisc - 62 vs 61
** @Neb - 69 vs 65
Purdue - 67 vs 65
@ Mich - correct
@Iowa - missing!
Oh St - 66 vs 64
@PSU (missing records)?
Mich - correct
Minn - 66 vs 65
v Neb - 65 vs 64
Belmont - missing!
LSU - 71 vs 72
Looking at the Wisconsin-UMD game (discrepancy of 5):
RawGameEvent(None, Some("12:32:00,10-17,Team, rebound offensive team"), None, Some(3)),
RawGameEvent(None, Some("12:32:00,10-17,Brevin Pritzl, substitution in"), None, Some(3)),
RawGameEvent(None, Some("12:32:00,10-17,D'Mitrik Trice, substitution out"), None, Some(3)),
RawGameEvent(Some("12:26:00,10-17,Team, rebound defensive team"), None, Some(3), None),
RawGameEvent(None, Some("12:26:00,10-17,Brad Davison, 2pt jumpshot 2ndchance missed"), None, Some(4)),
RawGameEvent(None, Some("12:25:00,10-17,Aleem Ford, foul personal"), None, Some(4)),
so MD's defensive rebound (12:26) gets registered before the missed shot it follows, and so gets logged as a change in possession
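One conceivable way to handle this ordering, sketched under assumptions and not necessarily one of the options weighed here: when a defensive rebound is immediately followed by the other side's missed shot at the same clock value, swap the pair so the shot comes first. The `Event` type and its fields are hypothetical.

```scala
object SameClockOrdering {
  // Hypothetical simplified event: clock string, which side logged it, and the action text.
  final case class Event(clock: String, isTeam: Boolean, action: String)

  private def isMissedShot(e: Event): Boolean = e.action.endsWith("missed")
  private def isDefensiveRebound(e: Event): Boolean = e.action.startsWith("rebound defensive")

  /** Swap "defensive rebound, then the other side's missed shot" pairs that share a clock value. */
  def reorder(events: List[Event]): List[Event] = events match {
    case reb :: shot :: rest
        if reb.clock == shot.clock && reb.isTeam != shot.isTeam &&
          isDefensiveRebound(reb) && isMissedShot(shot) =>
      shot :: reb :: reorder(rest)
    case head :: rest => head :: reorder(rest)
    case Nil => Nil
  }
}
```

The swap is deliberately narrow: an offensive rebound followed by a second-chance miss at the same clock value is left alone, since that order is already correct.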
Options:
I think the second one here might be nicest?
Added that logic, but it's still totally broken, here's the diff:
Map(
(TeamSeasonId(TeamId("Hofstra"), Year(2018)), Home) -> (5, 5),
(TeamSeasonId(TeamId("Wisconsin"), Year(2018)), Home) -> (4, 4),
(TeamSeasonId(TeamId("Iowa"), Year(2018)), Away) -> (2, 2),
(TeamSeasonId(TeamId("Navy"), Year(2018)), Away) -> (0, 0),
(TeamSeasonId(TeamId("Michigan"), Year(2018)), Away) -> (1, 1),
(TeamSeasonId(TeamId("Penn St."), Year(2018)), Home) -> (3, 3),
(TeamSeasonId(TeamId("Illinois"), Year(2018)), Neutral) -> (2, 2),
(TeamSeasonId(TeamId("Minnesota"), Year(2018)), Home) -> (2, 3),
(TeamSeasonId(TeamId("Northwestern"), Year(2018)), Home) -> (1, 1),
(TeamSeasonId(TeamId("Ohio St."), Year(2018)), Home) -> (1, 2),
(TeamSeasonId(TeamId("Loyola Chicago"), Year(2018)), Neutral) -> (1, 1),
(TeamSeasonId(TeamId("Delaware"), Year(2018)), Home) -> (3, 3),
(TeamSeasonId(TeamId("Belmont"), Year(2018)), Neutral) -> (2, 2),
(TeamSeasonId(TeamId("Virginia"), Year(2018)), Home) -> (1, 1),
(TeamSeasonId(TeamId("Nebraska"), Year(2018)), Home) -> (2, 2),
(TeamSeasonId(TeamId("Loyola Maryland"), Year(2018)), Home) -> (2, 2),
(TeamSeasonId(TeamId("Purdue"), Year(2018)), Away) -> (0, 0),
(TeamSeasonId(TeamId("Minnesota"), Year(2018)), Away) -> (4, 4),
(TeamSeasonId(TeamId("N.C. A&T"), Year(2018)), Home) -> (2, 2),
(TeamSeasonId(TeamId("Wisconsin"), Year(2018)), Away) -> (1, 1),
(TeamSeasonId(TeamId("Michigan St."), Year(2018)), Away) -> (0, 1),
(TeamSeasonId(TeamId("Radford"), Year(2018)), Home) -> (0, 0),
(TeamSeasonId(TeamId("Seton Hall"), Year(2018)), Home) -> (5, 5),
(TeamSeasonId(TeamId("LSU"), Year(2018)), Neutral) -> (1, 1),
(TeamSeasonId(TeamId("Nebraska"), Year(2018)), Neutral) -> (2, 2),
(TeamSeasonId(TeamId("Ohio St."), Year(2018)), Away) -> (1, 1),
(TeamSeasonId(TeamId("Mount St. Mary's"), Year(2018)), Home) -> (4, 4),
(TeamSeasonId(TeamId("Marshall"), Year(2018)), Home) -> (7, 6),
(TeamSeasonId(TeamId("Indiana"), Year(2018)), Home) -> (1, 1),
(TeamSeasonId(TeamId("Michigan"), Year(2018)), Home) -> (0, 0),
(TeamSeasonId(TeamId("Nebraska"), Year(2018)), Away) -> (2, 2),
(TeamSeasonId(TeamId("Penn St."), Year(2018)), Away) -> (0, 1),
(TeamSeasonId(TeamId("Rutgers"), Year(2018)), Away) -> (3, 3),
(TeamSeasonId(TeamId("Purdue"), Year(2018)), Home) -> (2, 2)
)
(TeamSeasonId(TeamId("Wisconsin"), Year(2018)), Home) -> (4, 4)
so that did address it (still off by 1) there, but eg Delaware was right and now is wrong :(
I think the new plan has got to be to change completely how I calc possessions, by looking for end of possessions
The approach I think I like is:
I think the rules are something like:
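As an illustration only, here is what counting by end-of-possession events could look like. The rules in the sketch are the conventional box-score ones (turnover, made field goal, made final free throw, defensive rebound by the other side) and are assumed here, not necessarily the ones intended above; the "NofN" free-throw pattern and the `(isTeam, raw)` tuple representation are also assumptions.

```scala
object PossessionEnds {
  /** Action part of a "time,score,player, action" string. */
  def action(raw: String): String = raw.split(",", 4).last.trim

  /** Assumed events that end a possession for the side that logged them. */
  def endsOwnPossession(raw: String): Boolean = {
    val a = action(raw)
    val made = a.endsWith("made")
    val fieldGoal = a.startsWith("2pt") || a.startsWith("3pt")
    val lastFreeThrow =
      Seq("freethrow 1of1", "freethrow 2of2", "freethrow 3of3").exists(p => a.startsWith(p))
    a.startsWith("turnover") || (made && (fieldGoal || lastFreeThrow))
  }

  /** A defensive rebound ends the OTHER side's possession. */
  def endsOtherPossession(raw: String): Boolean =
    action(raw).startsWith("rebound defensive")

  /** (team possessions, opponent possessions) for events given as (isTeam, raw). */
  def count(events: Seq[(Boolean, String)]): (Int, Int) = {
    val team = events.count { case (isTeam, raw) =>
      (isTeam && endsOwnPossession(raw)) || (!isTeam && endsOtherPossession(raw))
    }
    val opp = events.count { case (isTeam, raw) =>
      (!isTeam && endsOwnPossession(raw)) || (isTeam && endsOtherPossession(raw))
    }
    (team, opp)
  }
}
```

Known gaps in this sketch: a made basket followed by a made and-one free throw would count twice, and a missed final free throw relies on the following defensive rebound being present in the feed.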
The total numbers for 2018/9 (missing 9 lineup events/712) came out as:
2,375 possessions
2,236 points scored, so O/100 = 94
1,948 points allowed, so D/100 = 82
But KenPom raw ratings has us at 107.1 and 98.8
Sports ref has us as 2429 (+193), 2228 (+280), 6800 minutes (aka 34 games which is what I have), which at KP tempo of 66.0 gives 2244 possessions (-122)
(Which leads to O/100 = 108 and D/100 = 99, so basically in agreement with KP)
So I somehow have too many possessions but not enough points :/
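For reference, the per-100 figures above are plain ratios, and the Sports ref possession estimate follows from minutes and tempo. A small sketch of the arithmetic; the function names are made up for illustration, and the tempo conversion assumes 5 players on court and 40-minute games with no overtime.

```scala
object Efficiency {
  /** Points per 100 possessions. */
  def per100(points: Int, possessions: Int): Double = 100.0 * points / possessions

  /** Possessions implied by total player-minutes and a tempo in possessions per 40 minutes.
    * Assumes 5 players on court and 40-minute games (no overtime). */
  def possessionsFromTempo(playerMinutes: Int, tempo: Double): Double =
    playerMinutes / 5.0 / 40.0 * tempo
}
```

With the numbers above: `per100(2236, 2375)` is about 94.1 and `per100(1948, 2375)` about 82.0 for my totals, while `possessionsFromTempo(6800, 66.0)` gives 2244, and `per100(2429, 2244)` and `per100(2228, 2244)` come out at about 108.2 and 99.3.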