Closed johann-petrak closed 3 years ago
Holdout estimation with the doc_classification_holdout.py example script (5 splits, stratified, train split 0.8):
```
HOLDOUT Accuracy:   mean 0.9260386190754827 stdev 0.02326096390235921
HOLDOUT F1 MICRO:   mean 0.9260386190754827 stdev 0.02326096390235921
HOLDOUT F1 MACRO:   mean 0.9186748352877108 stdev 0.0245897094179357
HOLDOUT F1 OFFENSE: mean 0.8942415341996025 stdev 0.030496627391813518
HOLDOUT F1 OTHER:   mean 0.943108136375819 stdev 0.018691519027049935
HOLDOUT MCC:        mean 0.8386780795230324 stdev 0.047554615774727346
```
This is quite a bit better than the results from the cross-validation sample. Maybe because we also stratify the dev set here?
Fixed by #825.
Basically the same as we have now for cross-validation: implement
`DataSiloForHoldout`
with class method `make(cls, datasilo, sets=["train", "dev", "test"], n_splits=5, shuffle=True, train_split=0.7, stratified=True)`
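A minimal sketch of the splitting logic behind such a `make()` method, built on scikit-learn's `StratifiedShuffleSplit`/`ShuffleSplit`. The function name and the simplification of operating on a bare label sequence (instead of a `DataSilo`) are illustrative assumptions, not the actual implementation:

```python
from sklearn.model_selection import ShuffleSplit, StratifiedShuffleSplit

def make_holdout_splits(labels, n_splits=5, train_split=0.7, stratified=True, seed=42):
    """Yield (train_indices, test_indices) pairs for repeated holdout estimation.

    Illustrative only: the real DataSiloForHoldout.make() would operate on a
    DataSilo and rebuild its train/dev/test sets, not on a raw label list.
    """
    cls = StratifiedShuffleSplit if stratified else ShuffleSplit
    splitter = cls(n_splits=n_splits, train_size=train_split, random_state=seed)
    # StratifiedShuffleSplit uses y to preserve class proportions in each
    # split; plain ShuffleSplit ignores it.
    yield from splitter.split(labels, labels)
```

Each of the `n_splits` iterations draws an independent stratified train/test partition of the requested size, which is what the per-split mean/stdev metrics above are averaged over.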
The advantage of holdout estimation is that the training and test set sizes do not depend on the number of splits, so in some situations generalization estimates can be obtained more efficiently.
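The size-independence point can be seen by comparing repeated holdout against k-fold directly in scikit-learn (a small demonstration, not part of the proposed API): the holdout test partition stays at 30% of the data regardless of `n_splits`, while the k-fold test partition shrinks as `1/n_splits`.

```python
from sklearn.model_selection import KFold, StratifiedShuffleSplit

labels = ["OFFENSE"] * 40 + ["OTHER"] * 60  # 100 toy samples

for n_splits in (2, 5, 10):
    # repeated holdout: test size fixed by train_size, not by n_splits
    holdout = StratifiedShuffleSplit(n_splits=n_splits, train_size=0.7, random_state=0)
    _, holdout_test = next(holdout.split(labels, labels))
    # k-fold: test size is len(data) / n_splits
    _, fold_test = next(KFold(n_splits=n_splits).split(labels))
    print(f"n_splits={n_splits}: holdout test {len(holdout_test)}, k-fold test {len(fold_test)}")
```

With 100 samples this prints a constant holdout test size of 30 next to k-fold test sizes of 50, 20 and 10.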