fusr / Formula-1-Project


cross validation on all ML models #42

Open fusr opened 9 months ago

fusr commented 9 months ago

Splitting the data set:

Data Splitting: The first step is to split the dataset into at least two subsets: a training set and a testing (or validation) set. The training set is used to train the model, while the testing set is used to evaluate its performance. In cross-validation, the data is split further into multiple subsets, usually referred to as "folds." A basic hold-out split is sketched below.
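
A minimal sketch of the hold-out split with scikit-learn. The DataFrame, file name, and target column (`position`) are assumptions for illustration, not names taken from the project code:

```python
# Sketch of a basic train/test split, assuming a pandas DataFrame loaded
# from a hypothetical "races.csv" with a hypothetical "position" target.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("races.csv")          # hypothetical file name
X = df.drop(columns=["position"])      # feature columns
y = df["position"]                     # target column

# Hold out 20% of the rows for the final evaluation set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```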

K-Fold Cross-Validation: The most common form of cross-validation is k-fold cross-validation, where the data is divided into 'k' equal-sized subsets or folds. The model is trained and evaluated 'k' times, each time using a different fold as the testing set and the remaining folds as the training set. The performance scores (e.g., accuracy, mean squared error) from each fold are then averaged to provide an overall performance estimate.
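
A short sketch of 5-fold cross-validation on a single model, continuing from the split above. The choice of model (`RandomForestRegressor`) and scoring metric are assumptions, not decisions from this project:

```python
# 5-fold cross-validation: the model is trained and evaluated 5 times,
# each fold taking one turn as the validation set, and the scores are averaged.
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

model = RandomForestRegressor(random_state=42)  # assumed model choice

scores = cross_val_score(
    model, X_train, y_train, cv=5, scoring="neg_mean_squared_error"
)
print("Mean MSE across folds:", -scores.mean())
```

The same `cross_val_score` call can be repeated for each of the models we want to compare, keeping `cv` and `scoring` fixed so the averaged scores are directly comparable.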

More info is in the Hands-On Machine Learning notebook; Ana will send the link on Slack.