ybracke / transnormer

A lexical normalizer for historical spelling variants using a transformer architecture.
GNU General Public License v3.0

Evaluation: Trace metrics/predictions to train and test parameters #71

Open ybracke opened 10 months ago

ybracke commented 10 months ago

Generated predictions and evaluation metrics should be traceable to the parameters they were created with, i.e. the model, the training data, and the test data.

This information is stored in the files `train_config.toml` and `test_config.toml`. What we need is a way to link the predictions and evaluation results to the config files they are based on.

In the simplest form, each row in the metrics file (e.g. called `eval.jsonl`) would include a reference to the `test_config.toml` file it is linked to. (This assumes that the model directory referenced in `test_config.toml` in turn contains a copy of the `train_config.toml` that was in effect at training time. That way, both the training and the test parameters are linked to the evaluation results.) A row in the metrics file would then look like this:

{"test_config": "hidden/test_configs/544bd0c2.toml", "n": 100, "acc_harmonized": 0.95, "dist_normalized": 0.1} 

Here we would have a directory that serves as an archive, where every `test_config.toml` that was ever used is stored under a unique filename. Thus, unless we hide the archived configs from git (and thereby neither version them nor make them redistributable), we have to make a new git commit for each evaluation we run. To derive the unique filename from a hash of the config and archive it, run:

```bash
filename=$(md5sum test_config.toml | head -c 8).toml
cp test_config.toml hidden/test_configs/$filename
```
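The same step could also be done from within Python, so the evaluation script archives the config and records the resulting path in one go. A sketch under that assumption (the helper name `archive_config` is hypothetical):

```python
import hashlib
import shutil
from pathlib import Path

def archive_config(config_path: str, archive_dir: str = "hidden/test_configs") -> Path:
    """Copy a config file into the archive under the first 8 hex
    characters of the MD5 hash of its contents; return that path."""
    digest = hashlib.md5(Path(config_path).read_bytes()).hexdigest()[:8]
    target = Path(archive_dir) / f"{digest}.toml"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(config_path, target)
    return target

# e.g. archive_config("test_config.toml")
# -> Path("hidden/test_configs/544bd0c2.toml")
```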
ybracke commented 10 months ago

DVC

Eventually, we should use an `evaluate` stage (or similar) in DVC (see here and here).
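As a rough illustration (the command and file layout are assumptions, not the repo's actual setup), such a stage in `dvc.yaml` could look like:

```yaml
stages:
  evaluate:
    cmd: python evaluate.py --config test_config.toml
    deps:
      - test_config.toml
      - evaluate.py
    metrics:
      - eval.jsonl:
          cache: false
```

With `cache: false`, DVC tracks the metrics file through git rather than its own cache, so `dvc metrics show` / `dvc metrics diff` can compare evaluation runs across commits.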