hkenawi / FPL-optimizer

This repository contains the source code for an optimizer model that evaluates a user's team and recommends optimal decision making regarding transfers and chip usage

Clean xPts model data #3

Closed hkenawi closed 2 months ago

hkenawi commented 3 months ago

Clean each stacked dataset:
- Player offensive historical data
- Player match-log data
- Team defensive data
- Team standard and advanced goalkeeper data

Merge all cleaned data into the player match-log data using the necessary keys:
- Team and season for team data
- Player and season for player data
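
For reference, the merge could look roughly like the sketch below. It assumes pandas, and every column name and value here (`player`, `team`, `season`, `xg_per90`, `xga_per90`) is a hypothetical placeholder rather than the actual FBRef schema. Left joins keep every match-log row, and `validate="many_to_one"` raises if a key is duplicated on the right-hand side.

```python
import pandas as pd

# Toy frames with hypothetical column names; the real cleaned datasets will differ.
match_logs = pd.DataFrame({
    "player": ["A", "A", "B"],
    "team": ["X", "X", "Y"],
    "season": [2023, 2024, 2024],
    "minutes": [90, 75, 90],
})
player_hist = pd.DataFrame({
    "player": ["A", "A", "B"],
    "season": [2023, 2024, 2024],
    "xg_per90": [0.30, 0.35, 0.10],
})
team_def = pd.DataFrame({
    "team": ["X", "X", "Y"],
    "season": [2023, 2024, 2024],
    "xga_per90": [1.1, 1.0, 1.6],
})

merged = (
    match_logs
    # Player data joins on player + season.
    .merge(player_hist, on=["player", "season"], how="left", validate="many_to_one")
    # Team data joins on team + season.
    .merge(team_def, on=["team", "season"], how="left", validate="many_to_one")
)
print(merged)
```

The same pattern would extend to the goalkeeper data with one more `.merge(...)` on team and season.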

Reminder: split the data into train and test sets before imputing missing values or scaling features, so that no information from the test set leaks into those steps.
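
A minimal sketch of what that ordering means in practice, assuming pandas and simple mean imputation with standardisation (neither technique is decided in this issue; they are stand-ins). The helper name `impute_and_scale` and the `xg` column are hypothetical. The key point is that the fill value, mean, and standard deviation are computed from the training frame only and then reused on the test frame.

```python
import pandas as pd

def impute_and_scale(train, test, cols):
    """Fill NaNs and standardise `cols` using statistics from `train` only."""
    train, test = train.copy(), test.copy()
    for col in cols:
        fill_value = train[col].mean()       # learned from train only
        train[col] = train[col].fillna(fill_value)
        test[col] = test[col].fillna(fill_value)
        mu = train[col].mean()
        sigma = train[col].std(ddof=0)       # population std of the train column
        train[col] = (train[col] - mu) / sigma
        test[col] = (test[col] - mu) / sigma
    return train, test

# Toy frames that are assumed to be split already.
train_raw = pd.DataFrame({"xg": [0.2, None, 0.4, 0.6]})
test_raw = pd.DataFrame({"xg": [None, 0.8]})

train_clean, test_clean = impute_and_scale(train_raw, test_raw, ["xg"])
print(train_clean)
print(test_clean)
```

The missing test value ends up at exactly 0 after scaling because it was filled with the training mean.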

hkenawi commented 3 months ago

If you pick this up, create a new branch in the repo and, once development is done, open a PR into main for review.

hkenawi commented 3 months ago

With regard to the train/test split, I believe we can train on all the data we currently have and test only on 2024 results. That would leave just under 20% of the data for testing, which is perfectly fine, and it simplifies feature engineering.
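
A rough sketch of that hold-out, assuming pandas and reading the plan as "train on every season except 2024". The function name `split_by_season` and the `season` column are hypothetical, and the toy rows are only there to make the example run; the share on the real data has to be measured on the real data.

```python
import pandas as pd

def split_by_season(df, test_season, season_col="season"):
    """Hold out one season as the test set; everything else is training data."""
    test_mask = df[season_col] == test_season
    return df[~test_mask].copy(), df[test_mask].copy()

# Toy data: one row per season, purely illustrative.
df = pd.DataFrame({
    "season": [2020, 2021, 2022, 2023, 2024],
    "points": [2, 6, 1, 9, 3],
})

train, test = split_by_season(df, test_season=2024)
test_share = len(test) / len(df)
print(f"train rows: {len(train)}, test rows: {len(test)}, test share: {test_share:.1%}")
```

Splitting on a whole season rather than at random also keeps the evaluation forward-looking, since no 2024 match ends up in the training set.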

hkenawi commented 3 months ago

The following have had their base infrastructure built and can be considered complete or near completion:
- Consolidated FBRef Player Offensive Historical Data
- Consolidated FBRef Player Match Log Data

hkenawi commented 2 months ago

Player-level data processing complete. Team-level data processing complete.

Closing issue.