scottclowe opened this issue 9 years ago
I'm running models which I have tried to optimise to run faster, and I'm comparing different augmentation settings.
Shorthand: N(mu,sigma) = normal distribution
Translation seemed to be a bit of a hindrance, whereas shear and scale were not.
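For concreteness, here's a rough sketch of how per-image translation/shear/scale could be drawn using the N(mu, sigma) shorthand above. The means/sigmas and the use of skimage's `AffineTransform` are placeholders for illustration, not what our pipeline actually does, and `warp()`'s spline order may not map directly onto our `transform_order` setting.

```python
import numpy as np
from skimage import transform

def random_affine_params(rng):
    """Draw per-image augmentation parameters using the N(mu, sigma)
    shorthand above. The means/sigmas here are placeholders, not our
    actual tuned settings."""
    return dict(
        scale=rng.normal(1.0, 0.1),             # scale ~ N(1, 0.1)
        shear=np.deg2rad(rng.normal(0.0, 5.0)),  # shear ~ N(0, 5) degrees
        translation=rng.normal(0.0, 2.0, 2),     # shift ~ N(0, 2) px per axis
    )

def augment(image, rng, order=1):
    """Apply one random affine augmentation to a 2-D image array.
    `order` is warp()'s spline interpolation order (0 = nearest,
    1 = linear, ...), which may not correspond to transform_order."""
    p = random_affine_params(rng)
    tform = transform.AffineTransform(
        scale=(p["scale"], p["scale"]),
        shear=p["shear"],
        translation=tuple(p["translation"]),
    )
    return transform.warp(image, tform.inverse, order=order, mode="edge")

# e.g. aug = augment(img, np.random.default_rng(0))
```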
In all cases, augmentations prevent overfitting, keeping `train_nll` at the same level as `valid_nll`.
Comparing `transform_order`, it seems that the default of 0.5 is slightly better than using linear interpolation (1.0) after the same number of epochs. However, when training on the 10% test split of the data it is also ~5% slower.
The difference in duration on the 80% training split will be much larger, so maybe we should train with `transform_order:1` for the speed benefit now that we are short on time... We could use `transform_order:1` for the bulk of the training and then `transform_order:0.5` for fine-tuning at the end of training?
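If we do go with that split, a hypothetical epoch-based switch could be as simple as the sketch below; the function name and the 10% fine-tuning cutoff are made up for illustration.

```python
def transform_order_for_epoch(epoch, n_epochs, coarse=1.0, fine=0.5,
                              fine_fraction=0.1):
    """Use the faster linear interpolation (1.0) for the bulk of training,
    then switch to the slower default (0.5) for the final fine-tuning
    epochs. The 10% cutoff is a guess, not a tuned value."""
    if epoch >= (1.0 - fine_fraction) * n_epochs:
        return fine
    return coarse

# e.g. with 100 epochs: epochs 0-89 use 1.0, epochs 90-99 use 0.5
orders = [transform_order_for_epoch(e, 100) for e in range(100)]
```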
The final difference was a validation logloss of about 1.6 vs. 1.9.
The set of augmentations I rustled up by looking at what was generated by eye keeps doing worse than the online 8-aug setup.
We need to work out whether one of the augmentations is hindering the learning.
Generally, we need to find the best set of augmentations to use with this dataset.
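One way to work out whether a single augmentation is hindering learning would be a leave-one-out sweep over the set. In the sketch below, `train_and_score` is a stand-in for whatever training/evaluation entry point we end up using on the small split, and the augmentation list is a guess at our set, so this is only the bookkeeping:

```python
# `train_and_score(augmentations)` is a stand-in: it should train on the
# small split with the given augmentation list and return validation logloss.
BASE_AUGMENTATIONS = ["translation", "shear", "scale", "rotation", "flip"]

def leave_one_out_ablation(train_and_score, augmentations=BASE_AUGMENTATIONS):
    """Score the full augmentation set, then each set with one augmentation
    removed. An augmentation is suspect if dropping it lowers the logloss."""
    results = {"all": train_and_score(list(augmentations))}
    for aug in augmentations:
        reduced = [a for a in augmentations if a != aug]
        results["without_" + aug] = train_and_score(reduced)
    return results
```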