libornovax / master_thesis_code

Code for my master thesis: Vehicle Detection and Pose Estimation for Autonomous Driving
MIT License

Proposed enhancements #16

Closed libornovax closed 7 years ago

libornovax commented 7 years ago

Try these:

libornovax commented 7 years ago

Increasing the number of parameters

As shown below, increasing (approximately doubling) the number of parameters improved the performance of the network when the same learning setup was used.

Plots (learning curves): original network; more parameters; more parameters with uniform lr = 0.0005.

Note: We can see stagnation after 5000 training iterations. That is because at that point we decreased the learning rate by a factor of 10, which was not a good idea since the gradient is already quite small. That is why I ran one more test with a uniform learning rate, which seems to work better.
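For reference, the step-decay schedule used here (drop by a factor of 10 every 5000 iterations, as described above) can be sketched as a small helper; the function name and signature are mine, not part of the repo:

```python
def learning_rate(iteration, base_lr=0.0001, step=5000, factor=0.1):
    """Step decay: multiply the base rate by `factor` every `step` iterations."""
    return base_lr * factor ** (iteration // step)

# With base_lr = 0.0001 the rate drops to 1e-05 at iteration 5000,
# which leaves the updates too small to make further progress --
# the uniform schedule simply returns base_lr for every iteration.
```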

libornovax commented 7 years ago

Uniform and higher learning rate

Using a uniform learning rate instead of decreasing it proved to be better! A higher learning rate was also favorable.

Plots (learning curves): lr = 0.0001, decreasing every 5000 iterations; lr = 0.0001, fixed; lr = 0.0005, fixed.

libornovax commented 7 years ago

More convolutional layers

I changed the network to have more convolutional layers, but the same field of view. Here is the model:

```
macc_0.25_r2_x4
r2 c0.25
conv k3      o64
conv k3      o64
pool
conv k3      o128
conv k3  d2  o128
conv k3      o128
pool
conv k3      o256
conv k3  d1  o256
conv k3  d3  o256
conv k3  d5  o256
conv k3      o256
macc x4
```
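To see why the dilated layers (d2, d3, d5) let a deeper stack keep a large field of view cheaply, here is a small receptive-field calculator for the stack above; this is a sketch, and it assumes the pooling layers are 2x2 with stride 2 (not stated in the listing):

```python
def receptive_field(layers):
    """Receptive field of a stack of (kernel, stride, dilation) layers."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (k - 1) * d * jump  # dilation widens each layer's contribution
        jump *= s                 # stride multiplies the input step size
    return rf

# The macc_0.25_r2_x4 stack above as (kernel, stride, dilation) triples:
layers = [
    (3, 1, 1), (3, 1, 1),                                    # conv o64 block
    (2, 2, 1),                                               # pool (assumed 2x2/s2)
    (3, 1, 1), (3, 1, 2), (3, 1, 1),                         # conv o128 block
    (2, 2, 1),                                               # pool (assumed 2x2/s2)
    (3, 1, 1), (3, 1, 1), (3, 1, 3), (3, 1, 5), (3, 1, 1),   # conv o256 block
]
print(receptive_field(layers))  # 112 input pixels
```

Under these assumptions, the d3 and d5 layers alone contribute 24 and 40 pixels of receptive field, which is how the deeper network keeps the same field of view without extra pooling.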

Plots (learning curves): original network, decreasing learning rate; more convolutional layers, decreasing learning rate; more convolutional layers, fixed lr = 0.0005.

It seems that adding more convolutional layers helps, but we have apparently almost reached a floor, since the higher learning rate does not improve the performance much further.

libornovax commented 7 years ago

Comparison of training with lr = 0.0005

I show 4 plots. All of these trainings were run with the same setup: lr = 0.0005, 20000 iterations, batch size 32, with KITTI train and Jura test short as validation (as in all tests). The plots compare the influence of the changes introduced to the network. I also made one more run using leaky ReLU units.

(three learning-curve plots comparing the networks)

Leaky ReLU

(learning-curve plot for the leaky ReLU run)
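Leaky ReLU differs from plain ReLU only in letting a small fraction of negative inputs pass through. A minimal sketch for reference; the slope 0.01 is a common default, the value used in this run is not stated here:

```python
def relu(x):
    """Plain ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)

def leaky_relu(x, negative_slope=0.01):
    # Unlike plain ReLU, negative inputs keep a small, non-zero output
    # (and gradient), which can prevent units from "dying" during training.
    return x if x > 0 else negative_slope * x

print(relu(-2.0))        # 0.0
print(leaky_relu(-2.0))  # -0.02
print(leaky_relu(3.0))   # 3.0
```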

Note: We see that the validation error is basically the same in all trainings. However, introducing more convolutional layers or more parameters increases the network's ability to fit the training data. As can be seen, the "more conv" and "more param" networks fit the training data more closely - they have a lower loss on the training set.

It also seems that we have reached some minimum loss on the validation set, since none of the changes we introduced is able to decrease the validation error further.