imatge-upc / saliency-2016-cvpr

Shallow and Deep Convolutional Networks for Saliency Prediction
http://imatge-upc.github.io/saliency-2016-cvpr/
MIT License

Problems about training and image and saliency map normalization #9

Closed inkfish2016 closed 8 years ago

inkfish2016 commented 8 years ago
  1. Following the parameter settings in the paper and the training prototxt provided by kevinmcguinness in issue #3, I trained the deep net based on VGG_CNN_M, but it does not converge. My solver prototxt is:

     net: "train_deep_sal.prototxt"
     test_iter: 500
     test_interval: 1000
     display: 10
     average_loss: 20
     lr_policy: "step"
     gamma: 0.5
     stepsize: 100
     base_lr: 0.00000013
     momentum: 0.9
     iter_size: 1
     max_iter: 24000
     weight_decay: 0.0005
     snapshot: 5000
     snapshot_prefix: "train_deep_SalNet"
     solver_mode: GPU
     test_initialization: false

     Is this right?
  2. In the training prototxt, the saliency mean is 31 and the scale is 2/255. Does that rescale the saliency map to [-1, 1]? It seems that ([0, 255] - 31) * (2/255) does not map to [-1, 1], and this will mark more regions as salient? (See the quick check below.)
  3. In the post-processing stage, why does the net output add 127 rather than 31?
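
For reference, a quick numpy check of what the scaling in point 2 produces (the mean of 31 and scale of 2/255 are taken from the prototxt; the rest is just illustration):

```python
import numpy as np

# Training-time scaling: (pixel - mean) * scale, with mean = 31 and scale = 2/255.
vals = np.array([0.0, 31.0, 255.0])
scaled = (vals - 31.0) * (2.0 / 255.0)
print(scaled)  # approx [-0.24, 0.0, 1.76] -- not exactly [-1, 1]
```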
kevinmcguinness commented 8 years ago
  1. Is the network diverging, or is it just that the loss is decreasing slowly?
  2. Yes, it doesn't exactly scale values to [-1, 1], but this doesn't matter too much, since you can postprocess the values to stretch/clip them once the network is trained (see the sketch below). For most benchmarks, it is the relative values that matter more than the absolute values, so the scaling is mostly there to help ensure the gradients don't shrink or blow up.
  3. Yes, I think this is due to me testing two different versions of the model, one where I just used 127 and the other where I used the sample mean (31). From what I recall, both methods produce similar results, so you can use either (just be consistent at train and prediction time).
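
A minimal sketch of the kind of post-processing described in point 2, i.e. inverting the training-time scaling, min-max stretching, and clipping to [0, 255]; the function name, the `offset` parameter, and the exact order of steps are illustrative assumptions, not the repo's actual code:

```python
import numpy as np

def postprocess(net_output, offset=127.0):
    """Map the raw network output back to an 8-bit saliency map.

    `offset` should match whichever mean was used at training time (127 or 31).
    The min-max stretch only preserves relative values, which is what most
    saliency benchmarks score.
    """
    sal = net_output * (255.0 / 2.0) + offset                          # invert (x - mean) * 2/255
    sal = (sal - sal.min()) / (sal.max() - sal.min() + 1e-8) * 255.0   # min-max stretch
    return np.clip(sal, 0, 255).astype(np.uint8)
```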
inkfish2016 commented 8 years ago

Whether I use a fixed lr or a step lr, the loss varies little and always stays between 1000 and 3000. Are there other parameters that strongly affect the net?

kevinmcguinness commented 8 years ago

The train loss reported by caffe will be very noisy, since it is only computed on a small number of images. If you compute val loss periodically for a fairly large number of iterations (e.g. 500), this should be much more stable and be seen to decrease steadily over time.
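
If it helps, a rough pycaffe sketch of monitoring the averaged validation loss alongside training; the solver filename, the `loss` blob name, and the train/validate schedule are assumptions:

```python
import caffe
import numpy as np

caffe.set_mode_gpu()
solver = caffe.SGDSolver('solver.prototxt')  # hypothetical path to the attached solver

def val_loss(solver, n_batches=500, loss_blob='loss'):
    """Average the test-net loss over many batches for a stable estimate."""
    test_net = solver.test_nets[0]
    losses = [float(test_net.forward()[loss_blob]) for _ in range(n_batches)]
    return np.mean(losses)

for step in range(24):
    solver.step(1000)                       # train for 1000 iterations
    print(step, val_loss(solver))           # should decrease steadily, unlike the noisy train loss
```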

Attached: solver.prototxt.txt

kevinmcguinness commented 8 years ago

FYI, losses between 3000 and 1000 are normal: the loss is not normalized by the number of pixels.
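
To put those numbers in per-pixel terms, assuming the loss is Caffe's EuclideanLoss (sum of squared differences divided by 2 × batch size), a rough back-of-the-envelope conversion looks like this; the reported loss and the output resolution below are placeholders, not measured values:

```python
# loss = (1 / (2 * batch_size)) * sum over batch and pixels of (pred - target)^2
reported_loss = 2000.0   # e.g. somewhere in the observed 1000-3000 range
num_pixels = 10000       # placeholder: substitute the net's actual output H * W
per_pixel_mse = 2.0 * reported_loss / num_pixels
print(per_pixel_mse)     # the headline loss scales with the output resolution
```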

inkfish2016 commented 8 years ago

Thanks for your help. I get results on the SALICON validation set similar to your model's.

Goutam-Kelam commented 6 years ago

Hi, I am trying to implement your paper in PyTorch. I would like to know what you mean by 25000 iterations. Is it 25000 epochs or 5 epochs (since there are 10000 images in the SALICON training set and your batch size is 2, giving 5000 iterations per epoch)?

kevinmcguinness commented 6 years ago

Hi @Goutam-Kelam. Yes, that would be 5 epochs.

Goutam-Kelam commented 6 years ago

Thank you for your prompt reply.

Goutam-Kelam commented 6 years ago

Hi @kevinmcguinness, can you help me out with the new issue I have created relating to DeepNet in PyTorch?