I want the train and validation set loss from training to be included in the table produced by dvc exp show.
For that I need an eval folder that contains these values (e.g. in a JSON file) and that is updated with every experiment.
The file(s) in the eval folder should be created with a Python script located at src/transnormer/evaluation.
The script should be called in a separate stage of dvc.yaml, and its output file should be listed as a metric.
First idea for the script:
Look into the models/model directory, select the last checkpoint
Load the log_history and select the loss from the final train batch and final eval_loss
Write them to eval/metrics.json
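
A minimal sketch of this idea, not a final implementation. It assumes the checkpoints are subdirectories named checkpoint-<step> and that each one contains a trainer_state.json with a log_history list, which is how transformers.Trainer() normally saves them. The function names and the output key train_loss are hypothetical choices for illustration.

```python
import json
import re
from pathlib import Path


def find_last_checkpoint(model_dir):
    """Return the checkpoint-<step> subdirectory with the highest step, or None."""
    checkpoints = []
    for path in Path(model_dir).glob("checkpoint-*"):
        match = re.fullmatch(r"checkpoint-(\d+)", path.name)
        if path.is_dir() and match:
            checkpoints.append((int(match.group(1)), path))
    if not checkpoints:
        return None
    # Compare step numbers numerically so that checkpoint-1000 beats checkpoint-500.
    return max(checkpoints, key=lambda item: item[0])[1]


def final_losses(log_history):
    """Pick the last logged train loss and the last logged eval_loss."""
    # Assumption: train log entries carry a "loss" key, eval entries an "eval_loss" key.
    train = [entry["loss"] for entry in log_history if "loss" in entry]
    evals = [entry["eval_loss"] for entry in log_history if "eval_loss" in entry]
    return {
        "train_loss": train[-1] if train else None,  # hypothetical key name
        "eval_loss": evals[-1] if evals else None,
    }


def write_metrics(model_dir="models/model", out_file="eval/metrics.json"):
    """Read the log_history of the last checkpoint and write the final losses as JSON."""
    checkpoint = find_last_checkpoint(model_dir)
    if checkpoint is None:
        raise FileNotFoundError(f"no checkpoint-* directory found in {model_dir}")
    state = json.loads((checkpoint / "trainer_state.json").read_text())
    metrics = final_losses(state["log_history"])
    out_path = Path(out_file)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(metrics, indent=2))
    return metrics
```

The script in src/transnormer/evaluation would then only need to call write_metrics() when run, so that the dvc.yaml stage can execute it and track eval/metrics.json as a metric.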
Problem with that idea: It only works for the logs created by transformers.Trainer(), which typically do not include the final model. To get such results for the final model, I'd have to change the logging within train() in train_model.py and customize the computation and saving of scores (see my native PyTorch scripts for a first step).