Closed · bagustris closed this pull request 5 months ago
This fixes #130 (MLP did not report the best metrics at the end of training). Before this PR:
    DEBUG runmanager: run 0
    DEBUG model: value for loss not found, using default: cross
    DEBUG model: using model with cross entropy loss function
    DEBUG model: value for device not found, using default: cuda
    DEBUG model: using layers {'l1':128, 'l2':64}
    DEBUG model: value for learning_rate not found, using default: 0.0001
    DEBUG model: value for num_workers not found, using default: 5
    DEBUG modelrunner: run: 0 epoch: 0: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 1: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 2: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 3: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 4: result: test: 0.503 UAR
    DEBUG modelrunner: run: 0 epoch: 5: result: test: 0.509 UAR
    DEBUG modelrunner: run: 0 epoch: 6: result: test: 0.516 UAR
    DEBUG modelrunner: run: 0 epoch: 7: result: test: 0.518 UAR
    DEBUG modelrunner: run: 0 epoch: 8: result: test: 0.521 UAR
    DEBUG modelrunner: run: 0 epoch: 9: result: test: 0.520 UAR
    DEBUG modelrunner: run: 0 epoch: 10: result: test: 0.523 UAR
    DEBUG modelrunner: run: 0 epoch: 11: result: test: 0.522 UAR
    DEBUG modelrunner: run: 0 epoch: 12: result: test: 0.526 UAR
    DEBUG modelrunner: run: 0 epoch: 13: result: test: 0.523 UAR
    DEBUG modelrunner: run: 0 epoch: 14: result: test: 0.521 UAR
    DEBUG modelrunner: run: 0 epoch: 15: result: test: 0.518 UAR
    DEBUG modelrunner: run: 0 epoch: 16: result: test: 0.523 UAR
    DEBUG modelrunner: run: 0 epoch: 17: result: test: 0.526 UAR
    DEBUG modelrunner: run: 0 epoch: 18: result: test: 0.522 UAR
    DEBUG modelrunner: run: 0 epoch: 19: result: test: 0.521 UAR
    DEBUG modelrunner: plotting confusion matrix to train_dev_mlp_os_64-128_scale-standard_0_019_cnf
    DEBUG reporter: epoch: 19, UAR: .52, (+-.508/.534), ACC: .961
    DEBUG reporter: labels: [0, 1]
    DEBUG reporter: auc: 0.521, pauc: 0.520
    DEBUG reporter: result per class (F1 score): [0.981, 0.073]
After this PR, all metrics are reported using the best model from a specific epoch, not the last one:
    DEBUG modelrunner: run: 0 epoch: 0: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 1: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 2: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 3: result: test: 0.500 UAR
    DEBUG modelrunner: run: 0 epoch: 4: result: test: 0.502 UAR
    DEBUG modelrunner: run: 0 epoch: 5: result: test: 0.504 UAR
    DEBUG modelrunner: run: 0 epoch: 6: result: test: 0.513 UAR
    DEBUG modelrunner: run: 0 epoch: 7: result: test: 0.517 UAR
    DEBUG modelrunner: run: 0 epoch: 8: result: test: 0.522 UAR
    DEBUG modelrunner: run: 0 epoch: 9: result: test: 0.521 UAR
    DEBUG modelrunner: run: 0 epoch: 10: result: test: 0.525 UAR
    DEBUG modelrunner: run: 0 epoch: 11: result: test: 0.530 UAR
    DEBUG modelrunner: run: 0 epoch: 12: result: test: 0.526 UAR
    DEBUG modelrunner: run: 0 epoch: 13: result: test: 0.529 UAR
    DEBUG modelrunner: run: 0 epoch: 14: result: test: 0.531 UAR
    DEBUG modelrunner: run: 0 epoch: 15: result: test: 0.530 UAR
    DEBUG modelrunner: run: 0 epoch: 16: result: test: 0.530 UAR
    DEBUG modelrunner: run: 0 epoch: 17: result: test: 0.535 UAR
    DEBUG modelrunner: run: 0 epoch: 18: result: test: 0.529 UAR
    DEBUG modelrunner: run: 0 epoch: 19: result: test: 0.528 UAR
    DEBUG modelrunner: plotting confusion matrix to train_dev_mlp_os_64-128_scale-standard_0_019_cnf
    DEBUG reporter: Best score at epoch: 17, UAR: .534, (+-.52/.55), ACC: .965
    DEBUG reporter: labels: [0, 1]
    DEBUG reporter: auc: 0.535, pauc: 0.534 from epoch: 17
    DEBUG reporter: result per class (F1 score): [0.982, 0.114] from epoch: 17
    WARNING experiment: Save experiment: Can't pickle the trained model so saving without it. (it should be stored anyway)
    DEBUG experiment: Done, used 100.017 seconds
    DONE
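For illustration, the idea behind the change is sketched below. This is not the actual nkululeko code; `train_one_epoch` and `evaluate` are hypothetical callables, and the loop simply remembers the epoch with the highest test UAR so that the final report comes from that epoch rather than the last one.

```python
# Minimal sketch, assuming hypothetical training/evaluation helpers:
# keep the epoch with the highest test UAR and report that epoch's results.
from sklearn.metrics import recall_score


def run_epochs(model, train_one_epoch, evaluate, num_epochs=20):
    best_uar, best_epoch, best_preds = 0.0, 0, None
    for epoch in range(num_epochs):
        train_one_epoch(model)                # one pass over the training set
        preds, truths = evaluate(model)       # predictions on the test split
        uar = recall_score(truths, preds, average="macro")  # UAR = macro recall
        print(f"run: 0 epoch: {epoch}: result: test: {uar:.3f} UAR")
        if uar > best_uar:                    # remember the best epoch so far
            best_uar, best_epoch, best_preds = uar, epoch, preds
    # final reporting uses the best epoch, not the last one
    print(f"Best score at epoch: {best_epoch}, UAR: {best_uar:.3f}")
    return best_epoch, best_uar, best_preds
```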
For finetuning, there is no need to search for the best epoch, since the argument `load_best_model_at_end` is already set in the training args.
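For context, a minimal sketch of the relevant training arguments is shown below; the values are illustrative, and `metric_for_best_model="UAR"` assumes a `compute_metrics` function that returns a metric under that key.

```python
# Sketch of why finetuning needs no separate best-epoch search: the Hugging Face
# Trainer restores the best checkpoint itself when load_best_model_at_end=True.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=20,
    eval_strategy="epoch",          # `evaluation_strategy` on older transformers versions
    save_strategy="epoch",          # must match the evaluation strategy
    load_best_model_at_end=True,    # reload the best checkpoint after training
    metric_for_best_model="UAR",    # assumes compute_metrics reports a "UAR" key
    greater_is_better=True,
)
```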