csBob123 opened this issue 3 years ago
Hmm, can you try a few things:
* Start from a pre-trained model (e.g. a KITTI model) to see if it diverges
* Try another network (DepthResNet or PoseResNet)
* Play around with the learning rate
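A minimal sketch of what those overrides might look like in the YAML config; the `DepthResNet`/`PoseResNet` names come from this thread, but the `'18pt'` version string and exact keys are assumptions based on the repo's sample configs:

```yaml
model:
  optimizer:
    depth:
      lr: 0.0002   # hypothetical value to try; tune around the default
    pose:
      lr: 0.0002
  depth_net:
    name: 'DepthResNet'   # instead of 'PackNet01'
    version: '18pt'       # assumed: ResNet-18, ImageNet-pretrained
  pose_net:
    name: 'PoseResNet'
    version: '18pt'       # assumed naming, mirroring depth_net
```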
By the way, once you get some numbers you can try submitting to our EvalAI DDAD challenge! https://eval.ai/web/challenges/challenge-page/902/overview
Do you use any pre-trained weights to get the results of 0.173 (abs_rel) on DDAD and 0.111 (abs_rel) on KITTI, or do you just train from scratch?
No, those are trained from scratch with PackNet. I just mentioned pre-trained weights as a way to see if there is anything wrong with the training setup that you are using.
Hi, thanks for your work. Were the results on DDAD produced by training from scratch using the config setup provided here? https://github.com/TRI-ML/packnet-sfm/blob/master/configs/train_ddad.yaml
@a1600012888 Yes, that configuration file should work.
Thanks!
Hi, for the DDAD experiments, did you train the model on 8 GPU cards with this config file? If so, does that mean the effective batch size is 8 * 2 = 16 and the learning rate is 9e-5?
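For what it's worth, a hedged reading of that arithmetic, assuming `batch_size` in the config is per Horovod worker and is not rescaled internally:

```yaml
# horovodrun -np 8 launches 8 workers, one per GPU.
# If datasets.train.batch_size is 2 per worker (as the question assumes):
#   effective batch size = 8 workers * 2 images = 16 images per step
# Whether the 9e-5 learning rate is additionally scaled by the worker
# count depends on the trainer code and is not asserted here.
```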
Hi, thank you for releasing the code. I am trying to train PackNet on DDAD, but I cannot reproduce the results so far. I use 8 V100 GPUs. The training command is:

```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 horovodrun -np 8 -H localhost:8 python scripts/train.py ./configs/train_ddad.yaml
```

The details of my config are as follows:

```yaml
model:
  name: 'SelfSupModel'
  optimizer:
    name: 'Adam'
    depth:
      lr: 0.00009
    pose:
      lr: 0.00009
  scheduler:
    name: 'StepLR'
    step_size: 30
    gamma: 0.5
  depth_net:
    name: 'PackNet01'
    version: '1A'
  pose_net:
    name: 'PoseNet'
    version: ''
  params:
    crop: ''
    min_depth: 0.0
    max_depth: 200.0
datasets:
  augmentation:
    image_shape: (384, 640)
  train:
    batch_size: 8
    num_workers: 8
    dataset: ['DGP']
    path: ['/data/ddad_train_val/ddad.json']
    split: ['train']
    depth_type: ['lidar']
    cameras: [['camera_01']]
    repeat: [5]
  validation:
    num_workers: 8
    dataset: ['DGP']
    path: ['/data/ddad_train_val/ddad.json']
    split: ['val']
    depth_type: ['lidar']
    cameras: [['camera_01']]
  test:
    num_workers: 8
    dataset: ['DGP']
    path: ['/data/ddad_train_val/ddad.json']
    split: ['val']
    depth_type: ['lidar']
    cameras: [['camera_01']]
checkpoint:
  filepath: './data/experiments'
  monitor: 'abs_rel_pp_gt'
  monitor_index: 0
  mode: 'min'
```
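One sanity check that the schedule behaved as configured, assuming PyTorch-style `StepLR` semantics (an assumption; the trainer may wrap the scheduler differently):

```yaml
# StepLR with step_size: 30, gamma: 0.5, base lr 9e-5:
#   epochs  0-29: 9.0e-5
#   epochs 30-59: 9.0e-5 * 0.5 = 4.5e-5   # matches "Depth 4.50e-05" at E: 50 below
```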
```
E: 50 BS: 8 - SelfSupModel LR (Adam): Depth 4.50e-05 Pose 4.50e-05

* /data/ddad_train_val/ddad.json/val (camera_01)

| METRIC      | abs_rel | sqr_rel |  rmse  | rmse_log |  a1   |  a2   |  a3   |
| DEPTH       |  0.853  | 23.485  | 37.371 |  2.022   | 0.002 | 0.005 | 0.008 |
| DEPTH_PP    |  0.853  | 23.542  | 37.468 |  2.025   | 0.002 | 0.004 | 0.008 |
| DEPTH_GT    |  0.268  | 12.451  | 19.267 |  0.333   | 0.705 | 0.869 | 0.936 |
| DEPTH_PP_GT |  0.257  | 11.199  | 18.532 |  0.324   | 0.709 | 0.873 | 0.939 |
```
Are there any problems with my setup? Thank you for your attention.