Currently, the training process in stove requires the v-fold parameter ('v' in the modeling functions) to be set, and its minimum value is 2. Because this workflow cannot be applied to data with only a small number of rows, an additional workflow that skips cross-validation (CV) must be implemented.
Cross-validation is an essential step in the workflow and is always performed, for the following reasons:
The 'stove' package follows an 'AutoML' approach that automatically optimizes the hyperparameters of each model.
Hyperparameter optimization aims to create a generalizable model that does not overfit the training data and performs well on new datasets.
To achieve this, hyperparameter optimization must be paired with a procedure that guards against overfitting, and cross-validation serves that role: each candidate hyperparameter set is evaluated on held-out folds rather than on the data it was trained on.
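The idea can be sketched in a few lines of plain Python. This is a language-agnostic illustration of v-fold CV for hyperparameter selection, not stove's actual implementation; the function names (`vfold_indices`, `cv_score`) and the toy "shrunken mean" model are assumptions for the example only.

```python
def vfold_indices(n, v):
    """Split row indices 0..n-1 into v folds; v must be at least 2,
    mirroring the minimum value of the v-fold parameter."""
    if v < 2:
        raise ValueError("v must be >= 2")
    folds = [list(range(i, n, v)) for i in range(v)]
    return [(sorted(set(range(n)) - set(test)), test) for test in folds]

def cv_score(y, v, alpha):
    """Mean squared error of a toy 'shrunken mean' predictor
    (predicts alpha * training mean), averaged over held-out folds."""
    errors = []
    for train, test in vfold_indices(len(y), v):
        pred = alpha * sum(y[i] for i in train) / len(train)
        errors.extend((y[i] - pred) ** 2 for i in test)
    return sum(errors) / len(errors)

# Pick the hyperparameter value with the best cross-validated score.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
best_alpha = min([0.5, 0.8, 1.0], key=lambda a: cv_score(y, v=2, alpha=a))
```

Because every candidate is scored only on rows it never saw during fitting, the selected hyperparameter generalizes instead of memorizing the training data, which is the point made above.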
Furthermore, the amount of data required for ML model training depends on the complexity of the task and the learning algorithm, and usually amounts to hundreds of observations or more.
However, if too little data is used for training, the model may not perform well. To prevent this, we recommend using at least 1,000 data points, as mentioned in the README.md file.
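One way the proposed no-CV workflow could look is a dispatcher that falls back to a single holdout split when the data is too small for v-fold CV. This is a hedged sketch only: the function name `make_resampling_plan`, the 30-row threshold, and the 75/25 split ratio are illustrative assumptions, not stove's API or recommended defaults.

```python
import random

def make_resampling_plan(n_rows, v=5, min_rows_for_cv=30, seed=1234):
    """Return ('cv', folds) when v-fold CV is feasible,
    otherwise ('holdout', (train_idx, test_idx)) as a fallback."""
    rng = random.Random(seed)
    idx = list(range(n_rows))
    rng.shuffle(idx)
    if n_rows >= min_rows_for_cv and v >= 2:
        # Round-robin assignment of shuffled rows to v folds.
        folds = [idx[i::v] for i in range(v)]
        return "cv", folds
    # Too few rows for CV: fall back to a single 75/25 holdout split.
    cut = max(1, int(n_rows * 0.75))
    return "holdout", (idx[:cut], idx[cut:])
```

The caller would then run hyperparameter search against the folds in the CV branch, and either skip tuning or score candidates on the single holdout set in the fallback branch, with the caveat that a single small holdout gives a much noisier performance estimate than CV.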