Closed: gngdb closed this issue 9 years ago
Started with 192 channels in an extra layer in the middle of the network, otherwise keeping exactly the same architecture as in the ImageNet paper. Also monitoring the most recent model instead of the best one, to see how it trains. Results pending.
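A minimal sketch of what this modification amounts to, treating the architecture as a list of layer specs. The helper name, layer positions, and channel counts here are illustrative assumptions, not the actual configuration used in the experiment:

```python
def insert_extra_conv(layers, position, channels=192):
    """Return a copy of `layers` with an extra conv-layer spec inserted.

    `layers` is a list of dicts describing the network; the original
    list is left untouched so both variants can be compared.
    """
    new = list(layers)
    new.insert(position, {"type": "conv", "channels": channels})
    return new

# Hypothetical base architecture (channel counts are placeholders)
base = [{"type": "conv", "channels": 96},
        {"type": "conv", "channels": 256},
        {"type": "conv", "channels": 384}]

modified = insert_extra_conv(base, position=2)
print(modified)
```

The point of keeping everything else identical is that any score change can be attributed to the extra layer alone.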
Submitted results from this model and improved our score by 0.012. Now investigating different dropout values: the same value x for all but the last convolutional layer, and 0.5 for the last convolutional layer (inspired by the dropout paper by Srivastava et al.). Currently testing three models with x = 0.75, 0.9 and 0.5.
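The per-layer dropout scheme being tested can be sketched as a simple schedule; the function name and layer count below are assumptions for illustration, not part of the actual config:

```python
def dropout_schedule(n_conv_layers, x, last=0.5):
    """Dropout rate x for every conv layer except the last,
    which keeps the fixed rate `last` (0.5, following Srivastava et al.)."""
    return [x] * (n_conv_layers - 1) + [last]

# The three candidate values of x under test, assuming a 4-conv-layer net
for x in (0.75, 0.9, 0.5):
    print(dropout_schedule(4, x))
```

Note that x = 0.5 makes the schedule uniform, so that run doubles as a baseline against the other two.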
None of the tested dropout values improved on the current best score. So far, the best score came from the AlexNet-based model with the extra convolutional layer and 8-augmentation (alexnet_based_extra_convlayer.json).
As the network still hasn't diverged after a large number of training epochs, it could be worth trying to increase its capacity. Another reason is to better match the capacity of AlexNet. Yet another is that Matt originally coded it with fewer layers only to avoid filling the RAM of Tom's machine.