pjkirsch / gtsrb

Traffic Sign Recognition with Convolutional Neural Networks.

Approach for the Traffic Sign Recognition challenge #1

Open pjkirsch opened 9 years ago

pjkirsch commented 9 years ago

Here is the approach I chose for the Traffic Sign Recognition challenge. It is mostly inspired by the approach suggested in paper A: "Traffic Sign Recognition with Multi-Scale Convolutional Networks", by Pierre Sermanet and Yann LeCun.

SGD is used for optimization.

1) Data preparation: follow the same approach as "paper A" for preprocessing and validation set extraction.
2) Train a toy MLP (no convolutional layer) with images directly used as input features, one hidden layer and a softmax layer, to check the scripts.
EDIT: 2bis) Check the assumption from "paper A" that color information does not provide any improvement. If verified, keep only greyscale images for further tests.
3) Design a small Convolutional Neural Network.
4) Tune the learning rate.
5) Increase the number of kernels and see the influence on results.
6) Tune optimization hyperparameters.
7) Go deeper (add convolutional lower layers, then fully-connected higher layers).
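As a rough illustration of steps 1) and 2), here is a minimal sketch. PyTorch/NumPy are used purely for illustration, the Y-channel conversion and global normalization are simplifications of the "paper A" preprocessing, and all names are placeholders:

```python
# Sketch of steps 1-2: greyscale (Y channel) input, one hidden layer,
# softmax output. Framework and preprocessing details are assumptions.
import numpy as np
import torch
import torch.nn as nn

def to_y_channel(rgb):
    """Convert an HxWx3 uint8 RGB image to a globally normalized Y (luma) channel."""
    rgb = rgb.astype(np.float32) / 255.0
    y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return (y - y.mean()) / (y.std() + 1e-8)

class ToyMLP(nn.Module):
    """'mlp-toy': flattened 32x32 image -> 30 hidden units -> 43 classes."""
    def __init__(self, n_hidden=30, n_classes=43):
        super().__init__()
        self.hidden = nn.Linear(32 * 32, n_hidden)
        self.out = nn.Linear(n_hidden, n_classes)

    def forward(self, x):                      # x: (batch, 1, 32, 32)
        h = torch.tanh(self.hidden(x.flatten(1)))
        return self.out(h)                     # logits; softmax is applied inside the loss
```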

Potential ways of improvement:

pjkirsch commented 9 years ago

Report of first results:

All experiments were done using standard SGD with an initial learning rate of 1e-3 and a learning rate decay of 1e-7.
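The exact decay schedule is implementation-dependent; a common convention (assumed here, not stated above) is to divide the initial rate by (1 + decay * t) at update t:

```python
# Assumed decay schedule: lr_t = lr_0 / (1 + decay * t), applied per gradient update t.
lr0, decay = 1e-3, 1e-7

def learning_rate(t):
    return lr0 / (1.0 + decay * t)

print(learning_rate(0))        # 1e-3 at the first update
print(learning_rate(100_000))  # ~9.9e-4 after 100k updates
```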

Note: The "official score" is the accuracy achieved by the network on the test set at the epoch with the best score on the validation set. The "best score" is the best accuracy achieved by the network on the test set over all epochs.
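In other words, assuming per-epoch validation and test accuracy lists are recorded during training (names below are placeholders), the two metrics are:

```python
def official_and_best_score(val_acc, test_acc):
    # Epoch with the best validation accuracy decides the "official" test score.
    best_val_epoch = max(range(len(val_acc)), key=lambda e: val_acc[e])
    official = test_acc[best_val_epoch]   # test accuracy at the best-validation epoch
    best = max(test_acc)                  # best test accuracy over all epochs
    return official, best
```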

Exp1: mlp-toy1
Input: 32x32 YUV images
Hidden layer 1: 30 tanh units
Output layer: 43 softmax classifier
Cost function: log-likelihood
Official score: 71.78% after 8 epochs
Best score: 85.71% after 9 epochs

Exp2: mlp-toy2
Input: 32x32 Y images (grey-scale)
Hidden layer 1: 30 ReLU units
Output layer: 43 softmax classifier
Cost function: log-likelihood
Official score: 87.40% after 19 epochs
Best score: 87.40% after 19 epochs

Exp3: cnn1
Input: 32x32 Y images (grey-scale)
Hidden layer 1: Conv. 8 kernels of size 1x5x5, tanh units, max-pooling
Hidden layer 2: Conv. 8 kernels of size 8x3x3, tanh units, max-pooling
Hidden layer 3: 30 ReLU units
Output layer: 43 softmax classifier
Cost function: log-likelihood
Official score: 94.11% after 28 epochs
Best score: ==> 94.41% after 27 epochs <==
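For reference, a minimal sketch of cnn1 as described above. PyTorch is used only for illustration; 'valid' convolutions and 2x2 max-pooling are assumptions, since only kernel counts and sizes are given:

```python
import torch
import torch.nn as nn

class CNN1(nn.Module):
    """Sketch of 'cnn1': two conv/pool stages, a 30-unit ReLU layer, 43-way softmax."""
    def __init__(self, n_classes=43):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),  # 32x32 -> 28x28 -> 14x14
            nn.Conv2d(8, 8, kernel_size=3), nn.Tanh(), nn.MaxPool2d(2),  # 14x14 -> 12x12 -> 6x6
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(8 * 6 * 6, 30), nn.ReLU(),
            nn.Linear(30, n_classes),            # softmax handled by the loss below
        )

    def forward(self, x):                        # x: (batch, 1, 32, 32)
        return self.classifier(self.features(x))

model = CNN1()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)  # per-update decay would need a scheduler, omitted here
loss_fn = nn.CrossEntropyLoss()                     # log-likelihood cost over the softmax output
```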

Exp4: cnn2 (still running...)
Input: 32x32 Y images (grey-scale)
Hidden layer 1: Conv. 16 kernels of size 1x5x5, tanh units, max-pooling
Hidden layer 2: Conv. 8 kernels of size 8x3x3, tanh units, max-pooling
Hidden layer 3: 30 ReLU units
Output layer: 43 softmax classifier
Cost function: log-likelihood
Official score: 93.70% after 19 epochs
Best score: 93.84% after 17 epochs

According to the preliminary results from Exp4, more kernels in the first hidden layer do not provide much improvement.
Hypothesis 1: HL2 is too small to take advantage of a bigger HL1 --> provide more kernels to HL2
Hypothesis 2: the kernels learnt in HL1 are redundant --> try the Dropout technique
Hypothesis 3: too much overfitting --> try Dropout + other regularizations (L2, ...)

Note: Experiments on cnn1 to tune hyperparameters indicate that an initial learning rate of 1e-2 would be better (faster convergence).

pjkirsch commented 9 years ago

Results after a weekend of long computations:

Exp5: cnnDropOut1
Input: 32x32 Y images (grey-scale)
Hidden layer 1: Conv. 16 kernels of size 1x5x5, tanh units, max-pooling
Hidden layer 2: Conv. 16 kernels of size 8x3x3, tanh units, max-pooling
Hidden layer 3: 60 ReLU units
Output layer: 43 softmax classifier
Drop-out: before each hidden layer, 0.5 drop-out probability
Cost function: log-likelihood
Optimization: learning rate 1e-3, decay 1e-7, no momentum
Official score: 89.78% after 5 epochs
Best score: 89.78% after 5 epochs
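A minimal sketch of cnnDropOut1, again with PyTorch purely for illustration. 'Valid' convolutions, 2x2 pooling, and the second conv layer taking all 16 first-layer feature maps are assumptions:

```python
import torch
import torch.nn as nn

class CNNDropOut1(nn.Module):
    """Sketch of 'cnnDropOut1': dropout (p=0.5) before each hidden layer."""
    def __init__(self, n_classes=43, p=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Dropout(p),                                                 # before hidden layer 1
            nn.Conv2d(1, 16, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),   # 32x32 -> 28x28 -> 14x14
            nn.Dropout(p),                                                 # before hidden layer 2
            nn.Conv2d(16, 16, kernel_size=3), nn.Tanh(), nn.MaxPool2d(2),  # 14x14 -> 12x12 -> 6x6
            nn.Flatten(),
            nn.Dropout(p),                                                 # before hidden layer 3
            nn.Linear(16 * 6 * 6, 60), nn.ReLU(),
            nn.Linear(60, n_classes),
        )

    def forward(self, x):
        return self.net(x)

# Exp5 settings (no momentum); Exp7/8 add momentum=0.9 to the same architecture.
model = CNNDropOut1()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
```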

Exp6: cnnDropOut1
Input: 32x32 Y images (grey-scale)
Hidden layer 1: Conv. 16 kernels of size 1x5x5, tanh units, max-pooling
Hidden layer 2: Conv. 16 kernels of size 8x3x3, tanh units, max-pooling
Hidden layer 3: 60 ReLU units
Output layer: 43 softmax classifier
Drop-out: before each hidden layer, 0.5 drop-out probability
Cost function: log-likelihood
Optimization: learning rate 1e-2, decay 1e-7, no momentum
Official score: 71.40% after 1 epoch
Best score: 71.40% after 1 epoch

Exp7: cnnDropOut1
Input: 32x32 Y images (grey-scale)
Hidden layer 1: Conv. 16 kernels of size 1x5x5, tanh units, max-pooling
Hidden layer 2: Conv. 16 kernels of size 8x3x3, tanh units, max-pooling
Hidden layer 3: 60 ReLU units
Output layer: 43 softmax classifier
Drop-out: before each hidden layer, 0.5 drop-out probability
Cost function: log-likelihood
Optimization: learning rate 1e-3, decay 1e-7, with momentum 0.9
Official score: 95.27% after 35 epochs
Best score: 95.69% after 43 epochs

Exp8: cnnDropOut1
Input: 32x32 Y images (grey-scale)
Hidden layer 1: Conv. 16 kernels of size 1x5x5, tanh units, max-pooling
Hidden layer 2: Conv. 16 kernels of size 8x3x3, tanh units, max-pooling
Hidden layer 3: 60 ReLU units
Output layer: 43 softmax classifier
Drop-out: before each hidden layer, 0.5 drop-out probability
Cost function: log-likelihood
Optimization: learning rate 1e-3, decay 1e-6, with momentum 0.9
Official score: 95.04% after 53 epochs
Best score: 95.22% after 52 epochs

Exp9: cnnDropOut2
Input: 32x32 Y images (grey-scale)
Hidden layer 1: Conv. 32 kernels of size 1x5x5, tanh units, max-pooling
Hidden layer 2: Conv. 32 kernels of size 8x3x3, tanh units, max-pooling
Hidden layer 3: 60 ReLU units
Output layer: 43 softmax classifier
Drop-out: before each hidden layer, 0.6 drop-out probability
Cost function: log-likelihood
Optimization: learning rate 1e-3, decay 1e-7, with momentum 0.9
Official score: 96.09% after 85 epochs
Best score: ==> 96.52% after 93 epochs <==

Notes: