Best checkpoint metric - Githubissues

The best checkpoint metric has now been added to the task configs.

If the metric is a list of per-class values, we use the value of the foreground class when the number of classes is 1 and the average of the per-class values otherwise (as discussed).

The only awkward thing is retrieving the number of classes from the dataset. I had to include a separate clause for the limited-label scenario since the instance variable 'dataset' changes from a 'Dataset' object to a 'Subset' object when we are not training on the full dataset.

yurujaja / pangaea-bench

Best checkpoint metric #87