hkenawi / FPL-optimizer

This repository contains the source code for an optimizer model that evaluates a user's team and recommends optimal transfer and chip-usage decisions.

Clean xPts model data #3

Closed: hkenawi closed this issue 2 months ago

hkenawi commented 3 months ago

Clean each stacked dataset:

  1. Player offensive historical data

  2. Player match-log data

  3. Team defensive data

  4. Team standard and advanced goalkeeper data

  5. Merge all cleaned data into player match-log data by using the necessary keys:

    • Team and season for team data
    • Player and season for player data
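The merge in step 5 could be sketched roughly as below. This is only an illustration of a key-based left join onto the match-log rows, not the repository's actual code: the helper `left_join`, the column names (`player`, `team`, `season`, `minutes`, `goals_against`, `shots`), the `team_`/`player_` prefixes and the sample values are all hypothetical. With pandas, the same idea would normally be written with `DataFrame.merge(..., on=[...], how="left")`.

```python
def left_join(rows, lookup_rows, keys, prefix):
    """Left-join lookup_rows onto rows using the given key columns."""
    index = {}
    for lookup in lookup_rows:
        key = tuple(lookup[c] for c in keys)
        if key in index:
            # A duplicated key would silently multiply or overwrite rows.
            raise ValueError(f"duplicate key {key} in lookup table")
        index[key] = lookup

    joined = []
    for row in rows:
        match = index.get(tuple(row[c] for c in keys), {})
        # Prefix the joined columns so their origin stays visible.
        extra = {f"{prefix}{c}": v for c, v in match.items() if c not in keys}
        joined.append({**row, **extra})
    return joined


# Hypothetical sample rows; all names and numbers are made up.
match_logs = [
    {"player": "Player A", "team": "Team X", "season": 2023, "minutes": 90},
    {"player": "Player B", "team": "Team Y", "season": 2023, "minutes": 45},
]
team_defense = [
    {"team": "Team X", "season": 2023, "goals_against": 40},
]
player_history = [
    {"player": "Player A", "season": 2023, "shots": 80},
    {"player": "Player B", "season": 2023, "shots": 12},
]

# Team data joins on (team, season); player data on (player, season).
merged = left_join(match_logs, team_defense, ("team", "season"), "team_")
merged = left_join(merged, player_history, ("player", "season"), "player_")
```

Rows without a match (here, Player B has no team row) keep their original columns and simply lack the joined ones, which makes unmatched keys easy to spot before modelling.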

Reminder: split the data into train and test sets before interpolating missing values or scaling features, so that no test-set information leaks into those steps.
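A minimal sketch of why the order matters: the fill value and scaling statistics are estimated on the training rows only and then reused unchanged on the test rows. Median filling stands in here for whatever interpolation is eventually chosen, and the function names, the `xg` feature and the numbers are hypothetical.

```python
from statistics import mean, median, pstdev


def fit_preprocessor(train_values):
    """Estimate fill value and scaling statistics from training data only."""
    observed = [v for v in train_values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in train_values]
    # Fall back to 1.0 so a constant column does not divide by zero.
    sigma = pstdev(filled) or 1.0
    return {"fill": fill, "mean": mean(filled), "std": sigma}


def apply_preprocessor(values, params):
    """Fill missing values and standardise using already-fitted parameters."""
    filled = [params["fill"] if v is None else v for v in values]
    return [(v - params["mean"]) / params["std"] for v in filled]


# Hypothetical feature column with missing entries (None).
train_xg = [0.2, None, 0.4, 0.6]
test_xg = [None, 1.0]

params = fit_preprocessor(train_xg)                 # fitted on train only
train_scaled = apply_preprocessor(train_xg, params)
test_scaled = apply_preprocessor(test_xg, params)   # reuses train statistics
```

If the statistics were computed on the full dataset before splitting, the test values would influence the fill value and the scaling, and the test score would look better than it should.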

hkenawi commented 3 months ago

If working on this, create a new branch in the repo and, once development is done, open a PR into main for review.

hkenawi commented 3 months ago

With regard to the training/testing split, I believe we can train on all the data we currently have and test only on the 2024 results. This would give us just under 20% test data, which is perfectly fine and simplifies feature engineering.
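A season-based holdout like this might look roughly as follows. The `season` column name, the range of seasons and the row counts are placeholders, so the resulting test share is illustrative only and is not the project's real figure.

```python
def split_by_season(rows, test_season):
    """Train on seasons before test_season, test on test_season itself."""
    train = [r for r in rows if r["season"] < test_season]
    test = [r for r in rows if r["season"] == test_season]
    return train, test


# Hypothetical data: six seasons with ten match-log rows each.
rows = [
    {"season": season, "match": i}
    for season in range(2019, 2025)
    for i in range(10)
]

train, test = split_by_season(rows, test_season=2024)
test_share = len(test) / (len(train) + len(test))
print(f"train={len(train)} test={len(test)} test share={test_share:.1%}")
```

Splitting on the season rather than sampling rows at random keeps every 2024 match out of training, so features built from past seasons never see the results they are evaluated on.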

hkenawi commented 3 months ago

The following have had their base infrastructure built and can be considered complete or near completion:

  • Consolidated FBRef Player Offensive Historical Data
  • Consolidated FBRef Player Match Log Data

hkenawi commented 2 months ago

Player-level data processing complete. Team-level data processing complete.

Closing issue.