We want to change the way we calculate the median values for the pit stop and tyre related columns.
Calculate the medians on the training set only and use them to fill the test set columns, so that no future data is used (data leakage).
The medians should be specific to each circuitId: calculate the median pit stop and tyre values for every circuitId in the training set, and when a value in those test set columns has to be filled, replace it with the median for the matching circuitId.
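One way this could look in pandas is sketched below. It is only an illustration under some assumptions: the column name `pit_stop_duration` and the helper `fill_with_train_circuit_medians` are hypothetical, "a value to replace" is taken to mean a missing value (NaN), and the fallback to the overall training median for a circuitId that never appears in the training set is an addition that the description above does not specify.

```python
import pandas as pd


def fill_with_train_circuit_medians(train, test, columns, key="circuitId"):
    """Fill NaNs in `columns` of `test` with per-circuit medians from `train`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    filled = test.copy()
    # Medians come from the training set only, one row per circuitId.
    circuit_medians = train.groupby(key)[columns].median()
    # Assumption: circuits unseen in training fall back to the overall
    # training median of the column.
    overall_medians = train[columns].median()
    for col in columns:
        # Look up the training median that matches each test row's circuitId.
        per_circuit = filled[key].map(circuit_medians[col])
        filled[col] = filled[col].fillna(per_circuit).fillna(overall_medians[col])
    return filled


# Toy data; "pit_stop_duration" is a placeholder column name.
train = pd.DataFrame({
    "circuitId": [1, 1, 1, 2, 2],
    "pit_stop_duration": [20.0, 22.0, 24.0, 30.0, 32.0],
})
test = pd.DataFrame({
    "circuitId": [1, 2, 3, 1],
    "pit_stop_duration": [float("nan"), float("nan"), float("nan"), 21.0],
})

filled = fill_with_train_circuit_medians(train, test, ["pit_stop_duration"])
# circuit 1 -> 22.0, circuit 2 -> 31.0, unseen circuit 3 -> 24.0 (fallback),
# and the existing value 21.0 is left as it is.
print(filled)
```

Because the medians are taken from `train` alone, nothing from the test set influences the fill values. The same `circuit_medians` table can also be reused to fill the training set itself if that is wanted.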