alexgkendall / caffe-posenet

Implementation of PoseNet

Trials to reproduce the results in the paper using SGD #2

Open lim0606 opened 8 years ago

lim0606 commented 8 years ago

Hi, this is Jaehyun Lim

I have a problem with learning rate scheduling when trying to reproduce the results in the paper using SGD, as the paper does (see https://github.com/lim0606/caffe-posenet-googlenet).

As far as I understand from the paper, the model is trained for about 100 epochs, i.e. about 16 iterations for the King's College dataset; however, in my experience that was far too short for the model to converge with this learning rate schedule.

Therefore, I took the maximum number of iterations from the Adagrad solver in this repo, i.e. 30000, and it (fortunately) worked for the King's College dataset. My results are good enough compared with those reported in the paper (actually better, see https://github.com/lim0606/caffe-posenet-googlenet).
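As a rough check on the epoch and iteration counts discussed above, here is a minimal Python sketch of the conversion. The batch size of 75 is the value reported in the paper; the training-set size of 1220 images for King's College is an assumption that is not stated in this thread, and the helper names are made up for illustration.

```python
def iterations_per_epoch(num_train_images, batch_size):
    """Average number of solver iterations needed to see each image once."""
    return num_train_images / batch_size


def epochs_to_iterations(epochs, num_train_images, batch_size):
    """Convert a number of epochs to the nearest whole number of iterations."""
    return round(epochs * iterations_per_epoch(num_train_images, batch_size))


def iterations_to_epochs(iterations, num_train_images, batch_size):
    """Convert a number of solver iterations to (fractional) epochs."""
    return iterations * batch_size / num_train_images


# Assumption: 1220 training images for King's College (not stated in this thread).
KINGS_COLLEGE_TRAIN_IMAGES = 1220
# Batch size reported in the paper.
PAPER_BATCH_SIZE = 75

if __name__ == "__main__":
    # Iterations corresponding to 100 epochs under the assumptions above.
    print(epochs_to_iterations(100, KINGS_COLLEGE_TRAIN_IMAGES, PAPER_BATCH_SIZE))
    # Epochs corresponding to the 30000 iterations used by the Adagrad solver.
    print(iterations_to_epochs(30000, KINGS_COLLEGE_TRAIN_IMAGES, PAPER_BATCH_SIZE))
```

Under these assumptions one epoch is about 16 iterations and 100 epochs is roughly 1,600 iterations, so 30000 iterations corresponds to well over 1,000 epochs.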

However, I couldn't reproduce the results on the other datasets.

I would appreciate any advice on what I might have missed or misunderstood in your paper.

Best regards,

Jaehyun

alexgkendall commented 8 years ago

Yes, typically the model converges after approx 30,000 iterations.

How close were your results to the paper for the other datasets? Also, did you change the values of Beta? If you look at Fig 2 in the PoseNet paper, the model is quite dependent on a good choice of Beta.
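For context, the loss in the PoseNet paper adds a position term and an orientation term weighted by Beta: the Euclidean distance between predicted and ground-truth position, plus Beta times the Euclidean distance between the predicted quaternion and the unit-normalised ground-truth quaternion. A minimal Python sketch of that formula follows; the function and variable names are illustrative and are not taken from this repo.

```python
import math


def l2_norm(vector):
    """Euclidean length of a sequence of numbers."""
    return math.sqrt(sum(c * c for c in vector))


def posenet_loss(x_pred, x_true, q_pred, q_true, beta):
    """Position error plus beta times quaternion error (sketch of the paper's loss)."""
    # Normalise the ground-truth quaternion to unit length.
    q_norm = l2_norm(q_true)
    q_unit = [c / q_norm for c in q_true]
    position_error = l2_norm([a - b for a, b in zip(x_pred, x_true)])
    orientation_error = l2_norm([a - b for a, b in zip(q_pred, q_unit)])
    return position_error + beta * orientation_error
```

Because the quaternion term is bounded while the position term scales with the size of the scene, a larger scene generally needs a larger Beta to keep the two terms balanced.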

Cheers, Alex

lim0606 commented 8 years ago

@alexgkendall

Thank you for your reply!

  1. I used the models in the link (http://mi.eng.cam.ac.uk/~agk34/resources/PoseNet.zip); thus, the betas are set as follows:
    • King's College: 500
    • Old Hospital: 1000
    • Shop Facade: 100
    • St Mary's Church: 250
    • Street: 2000
  2. Results
    • King's College: Median error 1.88411343098 m and 2.33481308286 degrees
    • Old Hospital: Median error 2.78002953529 m and 2.94750986627 degrees
    • Shop Facade: Median error 2.2239086628 m and 4.27796134748 degrees
    • St Mary's Church: Median error 3.12559008598 m and 4.17480798865 degrees
    • Street: Median error 40.2640609741 m and 26.8703415628 degrees
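The figures above are medians over per-image position and orientation errors. As a sketch of how such numbers could be computed, assuming positions in metres and orientations as quaternions: the angular error below uses 2 * acos(|<q1, q2>|), a standard formula for the rotation angle between two unit quaternions, which may differ in detail from the evaluation script in this repo. All names are hypothetical.

```python
import math
from statistics import median


def position_error_m(x_pred, x_true):
    """Euclidean distance between predicted and true position."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_pred, x_true)))


def orientation_error_deg(q_pred, q_true):
    """Rotation angle in degrees between two quaternions (normalised first)."""
    def unit(q):
        n = math.sqrt(sum(c * c for c in q))
        return [c / n for c in q]

    dot = abs(sum(a * b for a, b in zip(unit(q_pred), unit(q_true))))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


def median_errors(predictions, ground_truth):
    """Each argument is a list of (position, quaternion) pairs, one per image."""
    pos = [position_error_m(p[0], g[0]) for p, g in zip(predictions, ground_truth)]
    ori = [orientation_error_deg(p[1], g[1]) for p, g in zip(predictions, ground_truth)]
    return median(pos), median(ori)
```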

Best regards,

Jaehyun

alexgkendall commented 8 years ago

Hi Jaehyun,

Sounds good to me then - are you also initialising your weights from a pretrained model on the Places dataset? (get weights here: https://github.com/BVLC/caffe/wiki/Model-Zoo)

Alex

lim0606 commented 8 years ago

Hi Alex,

Thank you for your reply :)

I did use the pretrained model for GoogLeNet.

Best regards,

Jaehyun Lim

alykhantejani commented 8 years ago

Hi @lim0606 - did you manage to reproduce results with SGD?

Thanks, Aly

lim0606 commented 8 years ago

@alykhantejani

> Hi @lim0606 - did you manage to reproduce results with SGD?

Yes and no...

I managed it only for the King's College data (see https://github.com/lim0606/caffe-posenet-googlenet).

I think I would have to run a grid search to tune the hyperparameters for the other datasets, i.e. stepsize, max iterations, gamma (a parameter of the learning rate policy), and initial learning rate, but I haven't had time to do it...

alykhantejani commented 8 years ago

@lim0606 Thanks for sharing the results!

alykhantejani commented 8 years ago

Hi @alexgkendall,

The original paper states:

It was trained using stochastic gradient descent with a base learning rate of 10^-5,
reduced by 90% every 80 epochs and with momentum of 0.9. Using one half of a
dual-GPU card (NVidia Titan Black), training took an hour using a batch size of 75.

However, the solver prototxt file (solver_posenet.prototxt) uses Adagrad with a base learning rate of 1e-3 and no decay. Is there a reason for this? Did you find that it trained faster or gave better results? Is this how the provided pre-trained models were trained?
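To make the comparison concrete, here is a minimal Python sketch of a step schedule matching the paper's description. It uses the formula documented for Caffe's "step" policy, base_lr * gamma ^ floor(iter / stepsize). Mapping "reduced by 90%" to gamma = 0.1 is an interpretation, and the stepsize in iterations depends on the dataset size and batch size, so the value below is only a placeholder.

```python
def step_learning_rate(base_lr, gamma, stepsize, iteration):
    """Caffe-style "step" policy: base_lr * gamma ^ floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


BASE_LR = 1e-5   # base learning rate quoted from the paper
GAMMA = 0.1      # interpretation of "reduced by 90%": keep 10% at each step
STEPSIZE = 1000  # hypothetical placeholder for "80 epochs" expressed in iterations

if __name__ == "__main__":
    for it in (0, 999, 1000, 2500):
        print(it, step_learning_rate(BASE_LR, GAMMA, STEPSIZE, it))
```

With this formula, a gamma of 0.9 would instead reduce the rate by only 10% per step.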

Thanks, Aly

lidaweironaldo commented 8 years ago

Hi,

Has anyone managed to achieve the error reported in the paper for "Street"? I tried both the SGD and Adagrad methods, but got errors about 10 times the value reported in the paper.

Best, Dawei

janosszabo commented 7 years ago

Hi @alexgkendall, I also tried to train Street with both SGD and Adagrad, but I could not reproduce the accuracy reported in the article either (4 m, as far as I remember). My error was also around 20-30 m. Your pretrained caffemodel worked fine, however, and when I trained Shop Facade it learned all right. Is it possible that something is missing from this repo or that some prototxt file is outdated? Thanks, Janos

janosszabo commented 7 years ago

Some more details: I used train_street.prototxt from your models zip file, along with the LMDB database I built from the contents of Street.zip using the script included in posenet/scripts (create_posenet_lmdb_dataset.py), and posenet/models/solver_posenet.prototxt from the repo as the solver file. I initialized the GoogLeNet weights with the model pretrained on Places, available here: http://vision.princeton.edu/pvt/GoogLeNet/. The distance error was around 20-30 m when I tested the model after training, whereas your pretrained caffemodel worked as expected and gave an error of 3-4 m. @alexgkendall, do you know why this could happen?

duongnamduong commented 7 years ago

Hi @alexgkendall, I have a question. I am trying to retrain the model on your "KingsCollege" dataset. I configured my solver as follows:

net: "./model1/train_val_kingscollege_googlenet.prototxt"
test_initialization: true
test_iter: 11
test_interval: 300
base_lr: 1e-5
lr_policy: "step"
gamma: 0.9
stepsize: 4000
display: 20
average_loss: 20
max_iter: 80000
solver_type: ADAGRAD
weight_decay: 0.0002
snapshot: 1000
snapshot_prefix: "./model1/snapshots/"
solver_mode: GPU

I used batch_size = 32. When I ran it, the result did not converge. Could you help me understand why?

ming-c commented 6 years ago

@janosszabo Although it has been more than a year, could you please tell me whether your model for the 'Street' scene ever converged to the result in the PoseNet paper?

DRAhmadFaraz commented 5 years ago

@ming-c @janosszabo @lim0606 @alexgkendall @lidaweironaldo

Could somebody please guide me on how to train this model from scratch on custom RGB images?

I would be very thankful. Regards