noahzn / Lite-Mono

[CVPR2023] Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
MIT License

Missing opt.json #37

Closed howardchina closed 1 year ago

howardchina commented 1 year ago

Hi! I want to reimplement Lite-Mono on KITTI, but I don't know the right training settings. I have carefully read your paper but still can't tell the right number of epochs and batch size for each model. I tried num_epochs=30 but only got a1=0.84 on KITTI. Could you provide the opt.json or training command for each model?

noahzn commented 1 year ago

Hi, did you load the pre-trained weights by setting --mypretrain? But even if you didn't use a pre-trained model, a1=0.84 is abnormally low; your result is even worse than Monodepth2's. batch_size should be set to 12, and --num_epochs=30 is fine.

Have you updated the code? A training command and the options.py file are both provided in this repo.
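Putting the pieces of this thread together, a Monodepth2-style invocation with the flags implied by the opt.json below might look like the following. This is a sketch, not a command verified against the repo; the paths and the script name train.py are assumptions.

```shell
python train.py \
    --data_path kitti_data_standard \
    --model_name litemono_b12e30 \
    --model lite-mono \
    --batch_size 12 \
    --num_epochs 30 \
    --mypretrain models/lite-mono-pretrain.pth
```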

howardchina commented 1 year ago

> Hi, did you load the pre-trained weights by setting --mypretrain? But even if you didn't use a pre-trained model, a1=0.84 is abnormally low; your result is even worse than Monodepth2's. batch_size should be set to 12, and --num_epochs=30 is fine.
>
> Have you updated the code? A training command and the options.py file are both provided in this repo.

Oh, I just forgot to add --mypretrain! My training settings were batch_size=12, num_epochs=30, and lr=[0.0001, 5e-6, 31, 0.0001, 1e-5, 31]. You are right, a1=0.84 is abnormally low. I will try --mypretrain.

My opt.json is below. I thought my settings were sound, but the result is not as good as I expected.

{
  "data_path": "kitti_data_standard",
  "log_dir": "./tmp",
  "model_name": "tab3_cuda0/kitti_litemono_b12e30",
  "split": "eigen_zhou",
  "model": "lite-mono",
  "weight_decay": 0.01,
  "drop_path": 0.2,
  "num_layers": 18,
  "dataset": "kitti",
  "png": false,
  "height": 192,
  "width": 640,
  "disparity_smoothness": 0.001,
  "scales": [
    0,
    1,
    2
  ],
  "min_depth": 0.1,
  "max_depth": 100.0,
  "use_stereo": false,
  "frame_ids": [
    0,
    -1,
    1
  ],
  "profile": true,
  "batch_size": 12,
  "learning_rate": 0.0001,
  "lr": [
    0.0001,
    5e-06,
    31.0,
    0.0001,
    1e-05,
    31.0
  ],
  "num_epochs": 30,
  "scheduler_step_size": 15,
  "v1_multiscale": false,
  "avg_reprojection": false,
  "disable_automasking": false,
  "predictive_mask": false,
  "no_ssim": false,
  "mypretrain": "models/lite-mono-pretrain.pth",
  "weights_init": "pretrained",
  "pose_model_input": "pairs",
  "pose_model_type": "separate_resnet",
  "no_cuda": false,
  "num_workers": 12,
  "use_cudnn": true,
  "load_weights_folder": null,
  "models_to_load": [
    "encoder",
    "depth",
    "pose_encoder",
    "pose"
  ],
  "log_frequency": 250,
  "save_frequency": 5,
  "eval_stereo": false,
  "eval_mono": false,
  "disable_median_scaling": false,
  "pred_depth_scale_factor": 1,
  "ext_disp_to_eval": null,
  "eval_split": "eigen",
  "save_pred_disps": false,
  "no_eval": false,
  "eval_eigen_to_benchmark": false,
  "eval_out_dir": null,
  "post_process": false
}

Besides, I added the following two lines to enable cuDNN.

torch.backends.cudnn.enabled = True
torch.backends.cudnn.benchmark = True
noahzn commented 1 year ago

Your setting looks good. What PyTorch and CUDA versions are you using?

howardchina commented 1 year ago

> Your setting looks good. What PyTorch and CUDA versions are you using?

torch.__version__ outputs '1.8.1+cu111'; the CUDA version is release 11.1, V11.1.74.

noahzn commented 1 year ago

It should be OK. I tested with CUDA 11.0 + PyTorch 1.7.1 and with CUDA 11.8 + PyTorch 1.12.1.

howardchina commented 1 year ago

> It should be OK. I tested with CUDA 11.0 + PyTorch 1.7.1 and with CUDA 11.8 + PyTorch 1.12.1.

I had also put these two lines after the optimising loop, resulting in a small initial learning rate of 5e-6.

self.model_lr_scheduler.step()
if self.use_pose_net:
    self.model_pose_lr_scheduler.step()

Now I have fixed my mistake and am retraining.
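The symptom above (LR pinned at 5e-6 from the start) is what happens when a scheduler sized in epochs is stepped once per batch: the whole schedule is exhausted within the first epoch and the LR sits at its floor. A pure-Python sketch, assuming a cosine-style schedule with the base LR, minimum LR, and period taken from the lr option above (the per-epoch batch count is a hypothetical number, not from the repo):

```python
import math

def cosine_lr(step, base_lr=1e-4, min_lr=5e-6, total_steps=31):
    # Cosine annealing from base_lr down to min_lr over total_steps
    # scheduler steps; past total_steps the LR stays at the floor.
    t = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

# Stepped once per epoch (intended): after epoch 1 the LR is still near 1e-4.
print(cosine_lr(1))

# Stepped once per *batch* by mistake: with a few thousand batches per KITTI
# epoch, the 31-step schedule is used up immediately, pinning the LR at 5e-6.
print(cosine_lr(3000))  # 5e-06
```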

howardchina commented 1 year ago

I guess the problem came from:

  1. the missing _init_weights of PoseDecoder in my PoseDecoder.py, and
  2. mode="bilinear" in the upsample function in my layers.py.

These two points differ from Monodepth2, and I had forgotten to change them in my code. Now I get a1=0.888, which is very close to your paper. Thank you again!

The issue can be closed now.
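For anyone hitting the same two problems, here is a minimal sketch of the fixes described above. The function names and the truncated-normal initializer are illustrative assumptions, not the repo's exact code; the nearest-neighbour upsample matches Monodepth2's layers.py:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def _init_weights(m):
    # Truncated-normal init for conv/linear layers, as is common in
    # transformer-style models (assumed; check PoseDecoder.py for the
    # repo's actual initializer).
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.trunc_normal_(m.weight, std=0.02)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

def upsample(x):
    # Monodepth2 doubles resolution with mode="nearest"; using
    # mode="bilinear" here was the second difference found.
    return F.interpolate(x, scale_factor=2, mode="nearest")

# Apply the initializer to a toy decoder and upsample a dummy feature map.
decoder = nn.Sequential(nn.Conv2d(16, 6, 1))
decoder.apply(_init_weights)
x = torch.randn(1, 16, 12, 40)
print(upsample(x).shape)  # torch.Size([1, 16, 24, 80])
```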

noahzn commented 1 year ago

Good to know you found the problems!