matsonj / nba-monte-carlo

Monte Carlo simulation of the NBA season, leveraging dbt, duckdb and evidence.dev
https://www.mdsinabox.com
MIT License
409 stars 63 forks source link

misc model updates #131

Open matsonj opened 9 months ago

matsonj commented 9 months ago

nfl

nba

matsonj commented 9 months ago

add the in-season analytics page in this branch: https://github.com/matsonj/nba-monte-carlo/pull/132

matsonj commented 9 months ago

moved evidence queries to sources on #133

matsonj commented 9 months ago

tiebreakers v1 https://github.com/matsonj/nba-monte-carlo/commit/4b3d1d36fa10274b15d7d5af5393c013822347bc

matsonj commented 9 months ago

fixing tiebreakers: https://github.com/matsonj/nba-monte-carlo/commit/ca5f64af26aa2d7ea4326e9097babfb12af5dcb7

I'm also adding some example code here because the head-to-head tiebreakers were very annoying for me to implement in SQL. However, with ChatGPT I was able to discover a good solution, and implement it very fast.

The sample code snippet is below which contains the logic for the h2h tiebreakers. You can copy+paste and run this directly as it is in harlequin or duckdb wasm, for example, to see for yourself.

-- Create table statement
CREATE TABLE games (
    game_id INT PRIMARY KEY,
    group_id VARCHAR(1),
    team1_id VARCHAR(3),
    team2_id VARCHAR(3),
    winner_id VARCHAR(3)
);

-- Commented out first set of INSERT statements
/*
INSERT INTO games (game_id, group_id, team1_id, team2_id, winner_id) VALUES
('1', 'A', 'A1', 'A2', 'A1'),
('2', 'A', 'A1', 'A3', 'A3'),
('3', 'A', 'A1', 'A4', 'A4'),
('4', 'A', 'A1', 'A5', 'A1'),
('5', 'A', 'A2', 'A3', 'A2'),
('6', 'A', 'A2', 'A4', 'A4'),
('7', 'A', 'A2', 'A5', 'A5'),
('8', 'A', 'A3', 'A4', 'A3'),
('9', 'A', 'A3', 'A5', 'A5'),
('10', 'A', 'A4', 'A5', 'A4');
*/

-- Second set of INSERT statements
INSERT INTO games (game_id, group_id, team1_id, team2_id, winner_id) VALUES
('1', 'A', 'A1', 'A2', 'A2'),
('2', 'A', 'A1', 'A3', 'A1'),
('3', 'A', 'A1', 'A4', 'A1'),
('4', 'A', 'A1', 'A5', 'A5'),
('5', 'A', 'A2', 'A3', 'A3'),
('6', 'A', 'A2', 'A4', 'A2'),
('7', 'A', 'A2', 'A5', 'A2'),
('8', 'A', 'A3', 'A4', 'A4'),
('9', 'A', 'A3', 'A5', 'A3'),
('10', 'A', 'A4', 'A5', 'A4');

-- Temporary tables and final SELECT statement
CREATE OR REPLACE TEMPORARY TABLE Wins AS
SELECT
    team_id,
    COUNT(*) AS total_wins
FROM (
    SELECT winner_id AS team_id FROM games WHERE group_id = 'A'
) AS WinningTeams
GROUP BY team_id;

CREATE OR REPLACE TEMPORARY TABLE HeadToHead AS
SELECT
    g.winner_id,
    CASE
        WHEN g.winner_id = g.team1_id THEN g.team2_id
        ELSE g.team1_id
    END AS opponent_id
FROM games g
WHERE g.group_id = 'A';

-- this is where the magic happens
CREATE OR REPLACE TEMPORARY TABLE HeadToHeadWins AS
SELECT
    h.winner_id AS team_id,
    COUNT(*) AS h2h_wins
FROM HeadToHead h
-- specifically right here, we only inspect the wins by team for each group of teams with the same number wins in each group.
INNER JOIN Wins w ON h.winner_id = w.team_id AND h.opponent_id IN (
    SELECT team_id FROM Wins WHERE total_wins = w.total_wins AND team_id != w.team_id
)
GROUP BY h.winner_id;

SELECT
    w.team_id,
    w.total_wins,
    COALESCE(h2h.h2h_wins, 0) AS head_to_head_wins,
    ROW_NUMBER() OVER (
        ORDER BY w.total_wins DESC, COALESCE(h2h.h2h_wins, 0) DESC, w.team_id
    ) AS rank
FROM Wins w
LEFT JOIN HeadToHeadWins h2h ON w.team_id = h2h.team_id
ORDER BY rank;
matsonj commented 8 months ago

fixing post-season: https://github.com/matsonj/nba-monte-carlo/commit/552352c63b20aa0bc99fb5d7585ca448b1310a98

we are still missing 4 games (results of in-season tournament knockout rounds) but I will add that when the games are added to the schedule.