minerva-ml / minerva-training-materials

Learn advanced data science on real-life, curated problems
https://neptune.ml/minerva
MIT License

[whales, task 3 output] what is validation/test score? #85

Open pknut opened 6 years ago

pknut commented 6 years ago

The validation score (2.0059) matches neither the validation loss (1.01667) nor the validation accuracy (0.77713). The test score is similarly hard to interpret. How are these two scores calculated?

226894.311837 | 2018-04-15 01-09-37 minerva >>> epoch 250 current lr: 0.0003252930814335209
226894.312173 | 2018-04-15 01-09-37 minerva >>> epoch 249 loss: 0.03353
226894.312389 | 2018-04-15 01-09-37 minerva >>> epoch 249 accuracy: 0.99986
226981.769858 | 2018-04-15 01-11-05 minerva >>> epoch 249 validation loss: 1.01667
226981.770167 | 2018-04-15 01-11-05 minerva >>> epoch 249 validation accuracy: 0.77713
227067.955128 | 2018-04-15 01-12-31 minerva >>> training finished...
<...>
227715.884304 | Validation score is 2.0059
227715.884506 | Test score is 2.1295
227715.884696 | That is a solid validation
227715.884888 | Sorry, but this score is not high enough to pass the task

pziecina commented 6 years ago

It seems the score is sometimes RMSE and sometimes log loss, depending on the pipeline stage/subproblem. See minerva/whales/validation.py:

SCORE_FUNCTIONS = {'localization': rmse_multi,
                   'alignment': rmse_multi,
                   'classification': log_loss_whales,
                   'end_to_end': log_loss_whales
                   }

Which subproblem are the results you pasted above from?
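
To illustrate how a per-subproblem lookup like SCORE_FUNCTIONS can yield a score on a different scale depending on the stage, here is a minimal sketch. The functions rmse and multiclass_log_loss, the SCORE_FUNCTIONS_SKETCH dictionary, and the score helper are hypothetical stand-ins written for this example; they are not the actual rmse_multi or log_loss_whales from minerva/whales/validation.py, whose exact definitions may differ.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over two flat sequences of numbers."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log of the probability assigned to the true class.

    y_true holds integer class indices; y_prob holds one list of
    class probabilities per sample.
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        # Clip to avoid log(0) on a fully confident wrong prediction.
        p = min(max(probs[label], eps), 1 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Hypothetical stand-ins mirroring the shape of SCORE_FUNCTIONS above.
SCORE_FUNCTIONS_SKETCH = {
    'localization': rmse,
    'alignment': rmse,
    'classification': multiclass_log_loss,
    'end_to_end': multiclass_log_loss,
}


def score(sub_problem, y_true, y_pred):
    """Pick the metric for the given subproblem and apply it."""
    return SCORE_FUNCTIONS_SKETCH[sub_problem](y_true, y_pred)


if __name__ == '__main__':
    print(score('localization', [0, 0], [3, 4]))
    print(score('classification', [0, 1], [[0.5, 0.5], [0.5, 0.5]]))
```

Both RMSE and log loss are unbounded above and lower is better, so a value such as 2.0 is not on the same scale as an accuracy between 0 and 1.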