HKBU-HPML / FADNet

Finetuning on KITTI failed #7

Closed dongli12 closed 2 years ago

dongli12 commented 4 years ago

Dear authors,

I tried to finetune your released FADNet on KITTI 2015. I ran "sh finetune.sh" with the default parameters, but it failed. Here is part of the log:

```
...
Number of model parameters: 144303840
Iter 0 3-px error in val = 9.296
Iter 1 3-px error in val = 4.060
Iter 2 3-px error in val = 6.256
MIN epoch 0 of round 0 total test error = 6.537
('self.multiScales: ', [AvgPool2d(kernel_size=1, stride=1, padding=0), AvgPool2d(kernel_size=2, stride=2, padding=0), AvgPool2d(kernel_size=4, stride=4, padding=0), AvgPool2d(kernel_size=8, stride=8, padding=0), AvgPool2d(kernel_size=16, stride=16, padding=0), AvgPool2d(kernel_size=32, stride=32, padding=0), AvgPool2d(kernel_size=64, stride=64, padding=0)], ' self.downscale: ', 1)
[0.32, 0.16, 0.08, 0.04, 0.02, 0.01, 0.005]
0.01
Iter 0 training loss = 7.274 , time = 1.30
Iter 1 training loss = inf , time = 0.38
Iter 2 training loss = inf , time = 0.32
Iter 3 training loss = inf , time = 0.33
Iter 4 training loss = inf , time = 0.34
Iter 5 training loss = inf , time = 0.36
Iter 6 training loss = inf , time = 0.32
Iter 7 training loss = inf , time = 0.32
Iter 8 training loss = inf , time = 0.36
Iter 9 training loss = inf , time = 0.33
Iter 10 training loss = inf , time = 0.39
Iter 11 training loss = inf , time = 0.33
Iter 12 training loss = inf , time = 0.34
Iter 13 training loss = inf , time = 0.33
Iter 14 training loss = inf , time = 0.31
Iter 15 training loss = inf , time = 0.39
...
```

Could you help address this issue? Thanks.

blackjack2015 commented 4 years ago

Dear Dong,

I have fixed the bugs. You may check out my latest dev branch. Please be careful about the learning rate when finetuning on KITTI 2015/2012 (in my experiments, init_lr=1e-5 gives good results).
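
The exact bug isn't described here, but a common cause of inf training losses when finetuning on KITTI is that its ground-truth disparity maps are sparse: pixels with no LiDAR measurement are stored as 0 and must be masked out of the loss. A minimal sketch of such a masked loss in PyTorch (function and argument names are illustrative, not the repository's actual fix):

```python
import torch
import torch.nn.functional as F

def masked_epe_loss(pred_disp, gt_disp, max_disp=192):
    # Keep only pixels with a valid ground-truth measurement
    # inside the disparity search range.
    mask = (gt_disp > 0) & (gt_disp < max_disp)
    if mask.sum() == 0:
        # No valid pixels in this crop: return a zero that still
        # participates in autograd instead of propagating inf/NaN.
        return pred_disp.sum() * 0.0
    return F.l1_loss(pred_disp[mask], gt_disp[mask])
```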

Best regards, Qiang Wang

dongli12 commented 3 years ago

Dear authors,

Using the dev branch, finetuning on KITTI works now. However, I finetuned your released model on KITTI 2015 and the results seem worse than your reported numbers. Could you provide the exact fine-tuning hyperparameters needed to reproduce the results on KITTI 2015? Here are my results for reference.

| Model | Noc: D1-bg (%) | Noc: D1-fg (%) | Noc: D1-all (%) | All: D1-bg (%) | All: D1-fg (%) | All: D1-all (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Reported results | 2.49 | 3.07 | 2.59 | 2.68 | 3.50 | 2.82 |
| Finetuned results using released FADNet | 2.95 | 3.72 | 3.08 | 3.13 | 4.35 | 3.33 |
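
For reference, the D1 numbers above follow the KITTI 2015 definition: a pixel counts as an outlier when its disparity error exceeds both 3 px and 5% of the ground-truth disparity, measured over valid pixels only. A minimal sketch (names and array layout are illustrative):

```python
import numpy as np

def d1_error(pred_disp, gt_disp):
    """Percentage of valid pixels whose disparity error exceeds
    both 3 px and 5% of the ground-truth value (KITTI D1)."""
    valid = gt_disp > 0  # sparse ground truth: 0 marks missing pixels
    err = np.abs(pred_disp[valid] - gt_disp[valid])
    bad = (err > 3.0) & (err > 0.05 * gt_disp[valid])
    return 100.0 * bad.mean()
```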

Thanks. Looking forward to your reply.

Best, Dong

blackjack2015 commented 3 years ago

Hi, Dong,

We have updated the "dev" branch with the newest KITTI finetuning scheme. The results are much better than those reported before. Please pull our "dev" branch and give it a try. Some tips follow.

  1. The initial learning rate is 1e-5. In total, we finetune the network for four rounds, each with its own set of multi-scale loss weights. Within each round, we halve the learning rate every 200 epochs; at the beginning of each round, the learning rate is reset to 1e-5 (see the sketch after this list).
  2. We finetune the network on all of the KITTI data.
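
A minimal sketch of this schedule in PyTorch. Only the 1e-5 initial rate, the halving every 200 epochs, and the per-round reset come from the tips above; the stand-in model, optimizer choice, and epoch count per round are assumptions:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, 3, padding=1)  # stand-in for FADNet

ROUNDS = 4              # one round per multi-scale loss-weight setting
EPOCHS_PER_ROUND = 600  # assumed; match the real finetuning script

for rnd in range(ROUNDS):
    # The learning rate is reset to 1e-5 at the start of every round...
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
    # ...and halved every 200 epochs within the round.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
    for epoch in range(EPOCHS_PER_ROUND):
        # train_one_epoch(model, optimizer)  # training loop elided
        scheduler.step()
```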

You may find our latest results in http://www.cvlibs.net/datasets/kitti/eval_scene_flow_detail.php?benchmark=stereo&result=fb66b3db2d2066965af87f7e4ca6bdc98bec2fb4 and http://www.cvlibs.net/datasets/kitti/eval_stereo_flow_detail.php?benchmark=stereo&error=3&eval=all&result=98cdc5e692e3977d3afe34b3f91db674861201a9

I would like to note that the KITTI dataset is very difficult to train on because of its limited number of samples. Reaching the top tier of the leaderboard requires many tricks, and we cannot hit the top 50 ourselves yet. Any collaboration or suggestions are welcome. Thank you!

Best regards, Qiang Wang
