PHNX-MOD / Machine_Learning_sport_Model

A study of a basketball model with a focus on Quarter Performance
https://vh3ewn-phnx0mod.shinyapps.io/cbbmodelrshiny/

Step 2 Data Preprocessing #12

Closed PHNX-MOD closed 1 year ago

PHNX-MOD commented 1 year ago

Merge (join) the box score data with the fixture information to create a single dataset that contains both match statistics and fixture details. You can use base R's merge function or the join functions from the dplyr package for this.
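
A minimal sketch of the join described above, written in plain Python for illustration rather than in the project's own tooling. The column names (game_id, home_team, away_team, team) and the sample rows are assumptions and are not taken from the project's data. It performs an inner join, so box score rows without a matching fixture are dropped, and it derives a home/away flag once the fixture columns are available.

```python
# Hypothetical sample rows; the real column names in the project may differ.
fixtures = [
    {"game_id": 1, "date": "2023-01-05", "home_team": "A", "away_team": "B"},
    {"game_id": 2, "date": "2023-01-08", "home_team": "B", "away_team": "C"},
]
box_scores = [
    {"game_id": 1, "team": "A", "PTS": 71, "ORB": 10},
    {"game_id": 1, "team": "B", "PTS": 65, "ORB": 8},
    {"game_id": 2, "team": "B", "PTS": 80, "ORB": 12},
    {"game_id": 3, "team": "D", "PTS": 59, "ORB": 7},  # no matching fixture
]


def inner_join(left, right, key):
    """Inner join two lists of dicts on a shared key column."""
    # Index the right-hand rows by key so each lookup is O(1).
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            # Combine fixture columns and box score columns into one row.
            joined.append({**right_row, **left_row})
    return joined


merged = inner_join(box_scores, fixtures, "game_id")

# With fixture details attached, a home/away flag is a simple comparison.
for row in merged:
    row["is_home"] = int(row["team"] == row["home_team"])
```

Whether unmatched rows should be dropped (inner join) or kept with missing values (left join) depends on how complete the fixture list is; the inner join here is only one possible choice.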

PHNX-MOD commented 1 year ago
  1. Create Features:

  2. Shooting Percentages:

    • Calculate the Field Goal Percentage (FG%) for each team. FG% is the ratio of successful field goals (2PM + 3PM) to total field goal attempts (2PA + 3PA).
    • Calculate the Three-Point Percentage (3P%) for each team. 3P% is the ratio of successful three-pointers (3PM) to total three-point attempts (3PA).
    • Calculate the Free Throw Percentage (FT%) for each team. FT% is the ratio of successful free throws (FTM) to total free throw attempts (FTA).
  3. Rebound Differential:

    • Calculate the Offensive Rebound Differential (ORD) for each team. ORD is the difference between the average offensive rebounds (ORB) a team secures and the average offensive rebounds their opponents secure.
    • Calculate the Defensive Rebound Differential (DRD) for each team. DRD is the difference between the average defensive rebounds (DRB) a team secures and the average defensive rebounds their opponents secure.
  4. Assist-to-Turnover Ratio:

    • Calculate the Assist-to-Turnover Ratio (AST/TOV) for each team. This ratio measures a team's ball-handling efficiency. It's the ratio of assists (AST) to turnovers (TOV).
  5. Steal and Block Averages:

    • Calculate the average number of steals (STL) and blocks (BLK) for each team. These metrics can represent a team's defensive capabilities.
  6. Foul Differential:

    • Calculate the average difference in the number of fouls committed (PF) between a team and its opponents. This can indicate a team's discipline on the court.
  7. Historical Performance:

    • Consider incorporating historical performance metrics. For example, calculate the team's win-loss record over a certain number of previous games.
  8. Home Court Advantage:

    • If applicable, include a binary feature that indicates whether the team played on its home court (as opposed to away or at a neutral site). Home court advantage can significantly impact game outcomes.
  9. Day of the Week:

    • Extract the day of the week from the date and include it as a categorical feature. Some teams may perform differently on specific days.
  10. Opponent Strength:

    • Consider including a feature that measures the strength of the opponent. This can be based on the opponent's win-loss record, rankings, or other relevant metrics.
  11. Time Since Last Game:

    • Calculate the number of days since each team's last game. Fatigue can play a role in performance.
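
The ratio-based features in the list above (shooting percentages, assist-to-turnover ratio, steal and block averages) could be sketched as follows. This is plain Python for illustration, not the project's code; the dictionary keys mirror the abbreviations used in the list, and the sample rows are made up. Returning None when a denominator is zero is an assumption about how missing values should be handled.

```python
from collections import defaultdict


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def pct(made, attempted):
    """Percentage rounded to two decimals, or None when there were no attempts."""
    ratio = safe_ratio(made, attempted)
    return None if ratio is None else round(ratio * 100, 2)


def game_features(row):
    """Shooting percentages and assist-to-turnover ratio for one team in one game."""
    return {
        "FG%": pct(row["2PM"] + row["3PM"], row["2PA"] + row["3PA"]),
        "3P%": pct(row["3PM"], row["3PA"]),
        "FT%": pct(row["FTM"], row["FTA"]),
        "AST/TOV": safe_ratio(row["AST"], row["TOV"]),
    }


def team_averages(rows, stats=("STL", "BLK")):
    """Average the given per-game stats for each team."""
    totals = defaultdict(lambda: defaultdict(float))
    games = defaultdict(int)
    for row in rows:
        games[row["team"]] += 1
        for stat in stats:
            totals[row["team"]][stat] += row[stat]
    return {
        team: {stat: totals[team][stat] / count for stat in stats}
        for team, count in games.items()
    }


# Hypothetical box score rows for one team over two games.
rows = [
    {"team": "A", "2PM": 20, "2PA": 40, "3PM": 8, "3PA": 20,
     "FTM": 15, "FTA": 20, "AST": 14, "TOV": 7, "STL": 6, "BLK": 3},
    {"team": "A", "2PM": 18, "2PA": 45, "3PM": 5, "3PA": 15,
     "FTM": 0, "FTA": 0, "AST": 10, "TOV": 0, "STL": 8, "BLK": 5},
]

per_game = [game_features(row) for row in rows]
averages = team_averages(rows)
```

In the second sample game the team has no free throw attempts and no turnovers, so FT% and AST/TOV come back as None instead of raising a division error.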
PHNX-MOD commented 1 year ago

Shooting percentages have been added to the SQL query:

ROUND((CAST(X2PM …
ROUND((CAST(X3PM AS REAL) / CAST(X3PA AS REAL)) * 100, 2) AS '3P%',
ROUND((CAST(FTM AS REAL) / CAST(FTA AS REAL)) * 100, 2) AS 'FT%'
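
The percentage expressions can be tried against an in-memory SQLite database. The sketch below uses Python's sqlite3 module only as a convenient way to run the SQL; the table name box_scores, the columns X3PM and X2PA, and the sample values are assumptions. The FG% expression follows the definition given earlier in this issue, (2PM + 3PM) / (2PA + 3PA), and NULLIF is added so that a team with zero attempts yields NULL rather than a division by zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE box_scores ("
    "team TEXT, X2PM INT, X2PA INT, X3PM INT, X3PA INT, FTM INT, FTA INT)"
)
# Hypothetical sample rows; team B has no free throw attempts.
conn.executemany(
    "INSERT INTO box_scores VALUES (?, ?, ?, ?, ?, ?, ?)",
    [("A", 20, 40, 8, 20, 15, 20), ("B", 18, 45, 5, 15, 0, 0)],
)

query = """
SELECT team,
       ROUND(CAST(X2PM + X3PM AS REAL) / NULLIF(X2PA + X3PA, 0) * 100, 2) AS "FG%",
       ROUND(CAST(X3PM AS REAL) / NULLIF(X3PA, 0) * 100, 2) AS "3P%",
       ROUND(CAST(FTM AS REAL) / NULLIF(FTA, 0) * 100, 2) AS "FT%"
FROM box_scores
ORDER BY team
"""
rows = conn.execute(query).fetchall()
conn.close()
```

Each returned row is a tuple of (team, FG%, 3P%, FT%); a NULL percentage arrives in Python as None.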

@Shmulvi, the next step is Rebound Differential.
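
One possible way to compute the differentials, sketched in plain Python for illustration. It assumes each game_id has exactly two box score rows (one per team) and uses made-up sample values. Averaging the per-game difference over a team's games gives the same number as subtracting the opponents' average from the team's own average, and the same function also covers the foul differential by passing PF.

```python
from collections import defaultdict


def average_differential(rows, stat):
    """Average per-game (own stat minus opponent stat) for each team.

    Assumes every game_id appears in exactly two rows, one per team.
    """
    by_game = defaultdict(list)
    for row in rows:
        by_game[row["game_id"]].append(row)

    diffs = defaultdict(list)
    for pair in by_game.values():
        if len(pair) != 2:
            continue  # skip games with a missing or duplicated team row
        first, second = pair
        diffs[first["team"]].append(first[stat] - second[stat])
        diffs[second["team"]].append(second[stat] - first[stat])
    return {team: sum(values) / len(values) for team, values in diffs.items()}


# Hypothetical box score rows: team A plays B, then C.
rows = [
    {"game_id": 1, "team": "A", "ORB": 10, "DRB": 25, "PF": 15},
    {"game_id": 1, "team": "B", "ORB": 8, "DRB": 22, "PF": 18},
    {"game_id": 2, "team": "A", "ORB": 12, "DRB": 20, "PF": 20},
    {"game_id": 2, "team": "C", "ORB": 9, "DRB": 24, "PF": 16},
]

ord_by_team = average_differential(rows, "ORB")  # Offensive Rebound Differential
drd_by_team = average_differential(rows, "DRB")  # Defensive Rebound Differential
foul_diff_by_team = average_differential(rows, "PF")
```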

PHNX-MOD commented 1 year ago

Next things to deal with:

    • Historical Performance
    • Day of the Week
    • Opponent Strength
    • Time Since Last Game
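
A rough sketch of how these four schedule-based features might be derived for one team, in plain Python for illustration. The field names, the sample schedule, the three-game window, and the use of a fixed opponent win percentage as the strength measure are all assumptions. The win-loss record only looks at games played before the current one, so the feature does not leak the result being predicted.

```python
from datetime import date

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

# Hypothetical schedule for one team, oldest game first; won is 1 or 0.
games = [
    {"date": date(2023, 1, 5), "opponent": "B", "won": 1},
    {"date": date(2023, 1, 8), "opponent": "C", "won": 0},
    {"date": date(2023, 1, 14), "opponent": "B", "won": 1},
    {"date": date(2023, 1, 15), "opponent": "D", "won": 1},
]
# Hypothetical strength measure: each opponent's win percentage.
opponent_win_pct = {"B": 0.60, "C": 0.45, "D": 0.70}


def schedule_features(games, opponent_strength, window=3):
    """Build per-game features from a chronologically sorted schedule."""
    features = []
    for i, game in enumerate(games):
        # Only games strictly before the current one count toward the record.
        previous = games[max(0, i - window):i]
        wins = sum(g["won"] for g in previous)
        days_rest = (game["date"] - games[i - 1]["date"]).days if i else None
        features.append({
            "day_of_week": DAY_NAMES[game["date"].weekday()],
            "days_since_last_game": days_rest,
            "recent_wins": wins,
            "recent_losses": len(previous) - wins,
            "opponent_win_pct": opponent_strength.get(game["opponent"]),
        })
    return features


features = schedule_features(games, opponent_win_pct)
```

The first game of the schedule has no previous game, so days_since_last_game is None there and the recent record is 0-0.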