Model does not give any feature importance to 'year' parameter

tmalik1116 / F1_Qualifying_Predictor_ML

Estimate Formula 1 qualifying results using ML

https://tmalik1116.github.io/F1_Qualifying_Predictor_ML/


Model does not give any feature importance to 'year' parameter #1

Open tmalik1116 opened 2 months ago

tmalik1116 commented 2 months ago

The first step towards fixing this is to add far more data to the training dataset. The planned expansion will more than triple the number of laps.

tmalik1116 commented 2 months ago

After expanding the dataset to cover a longer period of time, the model still gives no importance to the year. This is under investigation; more testing is required.

tmalik1116 commented 2 months ago

The model was further improved by adding a new parameter: "years_since_reg_changes". It can now correctly identify that lap times tend to get slower when new regulations are introduced and then get faster over the course of the regulation cycle. However, it still captures no general trend of decreasing lap times: the model assumes every regulation cycle results in exactly the same car performance and, by extension, the same lap times.
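For reference, a minimal sketch of how a feature like "years_since_reg_changes" could be derived from the season year. The list of regulation-change years and the function signature are assumptions made for illustration, not the project's actual code.

```python
from bisect import bisect_right

# Assumed seasons in which major regulation changes took effect.
# Illustrative only; the list used in the project may differ.
# Must be sorted in ascending order for bisect to work.
REG_CHANGE_YEARS = [2009, 2014, 2017, 2022]


def years_since_reg_changes(year, change_years=REG_CHANGE_YEARS):
    """Return the number of seasons since the most recent regulation change."""
    # Index of the first change year strictly greater than `year`.
    idx = bisect_right(change_years, year)
    if idx == 0:
        raise ValueError(f"{year} is before the first listed regulation change")
    return year - change_years[idx - 1]
```

Because this value resets to 0 at every change, a model that sees only this feature has no way to tell one cycle from another, which matches the behaviour described above.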

tmalik1116 commented 2 months ago

One next step that may help with this issue is adding one or two more parameters to each entry: the team, and the team's WCC ranking at the end of the season (or perhaps at the time the lap was recorded, for a more accurate picture of midseason performance). The model can already pick up that certain teams perform differently in certain eras with the same drivers, but this correlation could be made even stronger.

tmalik1116 commented 2 months ago

Some of the recent changes need to be examined: the predictions for 2024 and 2025 are not very repeatable. Overall, lap times still do not decrease as the year increases, despite weight being added specifically to the year parameter. Currently under investigation.

tmalik1116 commented 2 months ago

With more years added to the dataset (2014-2017 planned), the model should have enough data to notice a general trend of faster lap times over time, with the start of each new regulation cycle setting that progress back in the short term.
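Before relying on the model to find this trend, it may be worth checking that the trend is visible in the data at all. The sketch below fits an ordinary least-squares line to lap time against year; the function name and the input format (parallel lists of years and lap times in seconds, for a single track) are assumptions for illustration.

```python
def linear_trend(years, lap_times):
    """Fit lap_time = slope * year + intercept by ordinary least squares."""
    if len(years) != len(lap_times) or len(years) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(lap_times) / n
    # Sum of squared deviations of the years from their mean.
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all years are identical, slope is undefined")
    # Sum of cross-deviations between years and lap times.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, lap_times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A negative slope means lap times at that track fall over the years covered. If the slope is close to zero over the span of the dataset, the model has little signal to learn from, whatever weight is given to the year parameter.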

tmalik1116 commented 1 month ago

I could incorporate a much larger selection of laps if I included all laps, regardless of driver. The main issue is that my current process for obtaining each driver's average grid position is very tedious, so I need to automate it somehow. (Look into doing this by making JSON requests to the statsf1.com website.)
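Whatever the data source ends up being, the averaging step itself is simple to automate. A minimal sketch, assuming the results have already been collected as (driver, grid position) pairs; the function name and input format are hypothetical, and the fetching step is left out because the request format of the site has not been checked here.

```python
from collections import defaultdict


def average_grid_positions(results):
    """Return {driver: mean grid position} from (driver, grid_position) pairs."""
    # Per driver: running [sum of grid positions, number of entries].
    totals = defaultdict(lambda: [0, 0])
    for driver, grid in results:
        totals[driver][0] += grid
        totals[driver][1] += 1
    return {driver: total / count for driver, (total, count) in totals.items()}
```

Example usage with placeholder driver codes:

```python
results = [("AAA", 1), ("AAA", 3), ("BBB", 10)]
averages = average_grid_positions(results)
# averages["AAA"] is 2.0 and averages["BBB"] is 10.0
```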