ZM-Zhou / SDFA-Net_pytorch

Apache License 2.0

I cannot reproduce the results on the paper #7

Open jiyeooong opened 1 year ago

jiyeooong commented 1 year ago

Thanks for sharing your work!

I ran the code in this repo, but I was not able to reach the results reported in the paper. My results are below. My RMSE (4.072) is noticeably higher than the score reported in the paper (3.896).

Do you have the Stage 1 scores for the best epoch and for the last epoch (epoch 25)?

(My reproduction)

[image]

(Paper)

[image]

ZM-Zhou commented 1 year ago

Hi, thanks for your interest, and sorry for the late reply. I think the environment (e.g. the PyTorch version) and the batch size are the most likely causes. Could you paste the _meta.txt file, which records the training environment and settings? I will upload the pre-trained Stage 1 model as soon as possible, and I hope it will be helpful.
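When comparing runs across environments, it can help to pin the random seeds and to check how many samples each optimizer step actually sees. The following is only a minimal sketch, not code from this repo: `seed_everything` and `effective_batch_size` are hypothetical helper names, the default seed 2048 is the `seed` value that appears in the _meta.txt pasted later in this thread, and the batch-size formula assumes that `batch_size` is counted per process under parallel training, which the thread does not confirm.

```python
import os
import random


def seed_everything(seed: int = 2048) -> None:
    """Seed the common RNGs; NumPy and PyTorch are seeded only if installed."""
    random.seed(seed)
    # Only affects Python processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # ignored when CUDA is unavailable
        # Trade some speed for repeatable cuDNN kernel selection.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


def effective_batch_size(per_process_batch: int, num_processes: int) -> int:
    """Samples per optimizer step, ASSUMING batch_size is per process."""
    return per_process_batch * num_processes
```

Even with fixed seeds, different PyTorch, CUDA, or cuDNN versions can still produce slightly different numbers, so this narrows the gap rather than guaranteeing identical scores.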

jiyeooong commented 1 year ago

Thank you for the reply.

[image] [image]

This is the meta file.

ZM-Zhou commented 1 year ago

Hi, I think the possible causes of your problem are: 1. PyTorch's parallel training, and 2. the PyTorch version. I have pasted below the _meta.txt from the Stage 1 experiment that I ran when I checked this repo. I have also uploaded the Stage 1 model (epoch 25), whose scores are:

    | abs_rel  |  sq_rel  |   rms    | log_rms  |    a1    |    a2    |    a3    |  scale   |
    |     0.100|     0.631|     4.090|     0.183|     0.890|     0.964|     0.983|     1.020|

I hope this will be helpful.
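For readers comparing numbers by hand, the column names in the table above match the depth-error metrics commonly reported on KITTI. Below is a minimal pure-Python sketch of those common definitions. It assumes paired, strictly positive ground-truth and predicted depths in metres; `depth_metrics` is a hypothetical name, and the repo's own evaluation may differ in cropping, depth capping, and per-image averaging. The `scale` column is not covered because the thread does not define it.

```python
import math


def depth_metrics(gt, pred):
    """Common depth errors for paired, strictly positive depth values."""
    pairs = list(zip(gt, pred))
    n = len(pairs)
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in pairs) / n
    rms = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    log_rms = math.sqrt(
        sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n
    )
    # Accuracy under threshold: share of points with max ratio < 1.25**k.
    ratios = [max(g / p, p / g) for g, p in pairs]
    a1, a2, a3 = (sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3))
    return {
        "abs_rel": abs_rel, "sq_rel": sq_rel, "rms": rms,
        "log_rms": log_rms, "a1": a1, "a2": a2, "a3": a3,
    }
```

Lower is better for the four error columns, and higher is better for a1, a2, and a3.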

#-------------------------------------------------------------------------------
# Monocular depth esitmation with Pytorch
    -Experiment: SDFA-Net-SwinT-M_192Crop_KITTI_S_St1_B12
    -Start at: 2022-07-08_08h25m38s
#-------------------------------------------------------------------------------
# Logger Initialization Done!
#-------------------------------------------------------------------------------
# Environment Information
sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU Number: 1
GPU 0: Tesla V100-PCIE-32GB
CUDA_HOME: /share/home/zhouzhengming/cuda-10.2
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GCC: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
PyTorch: 1.7.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.8.1
#-------------------------------------------------------------------------------
# Options
  - batch_size: 12
  - best_compute: depth_kitti
  - beta1: 0.5
  - clip_grad: -1
  - decay_rate: 0.5
  - decay_step: [30, 40]
  - epoch: 25
  - exp_name: SDFA-Net-SwinT-M_192Crop_KITTI_S_St1_B12
  - exp_opts: dev_options/SDFA-Net/train/sdfa_net-swint-m_192crop_kitti_stereo_stage1.yaml
  - learning_rate: 0.0001
  - local_rank: 0
  - log_dir: ./train_log
  - log_freq: 100
  - metric_name: ['depth_kitti']
  - num_workers: 8
  - optim_name: Adam
  - pre_model: None
  - save_freq: 5
  - seed: 2048
  - start_epoch: None
  - visual_freq: 2000
  - weight_decay: 0.5
#-------------------------------------------------------------------------------
# Used Dataset
    -train Datasets
      get 22600 of data
        dataset_mode: train
        split_file: data_splits/kitti/train_list.txt
        full_size: [384, 1280]
        patch_size: [192, 640]
        random_resize: True
        normalize_params: [0.411, 0.432, 0.45]
        flip_mode: img
        color_aug: True
        output_frame: ['o']
        multi_out_scale: None
        load_KTmatrix: False
        load_depth: True
        load_depthhints: False
        is_fixK: True
        stereo_test: False
        jpg_test: False
        improved_test: False
      1883 iters with 8 workers
    -val Datasets
      get 697 of data
        dataset_mode: val
        split_file: data_splits/kitti/test_list.txt
        full_size: [384, 1280]
        patch_size: None
        random_resize: True
        normalize_params: [0.411, 0.432, 0.45]
        flip_mode: None
        color_aug: True
        output_frame: ['o']
        multi_out_scale: None
        load_KTmatrix: False
        load_depth: True
        load_depthhints: False
        is_fixK: True
        stereo_test: False
        jpg_test: False
        improved_test: False
      697 iters with 8 workers
# Datasets and Dataloaders Initialization Done!
#-------------------------------------------------------------------------------
# Model and Losses
    -SDFA_Net
    -params: 32.269788
      encoder: orgSwin-T-s2
      decoder: SDFA
      out_num: 49
      min_disp: 2
      max_disp: 300
      image_size: [192, 640]
      feat_mode: vgg19
      distill_offset: False
    -losses
      photo_l1 : rate=1.00000
        pred_n: synth_img_{}
        target_n: color_{}_aug
        l1_rate: 1
        l2_rate: 0
        ssim_rate: 0
        other_side: True
      perceptual-1 : rate=0.01000
        pred_n: synth_feats_0_{}
        target_n: raw_feats_0_{}
        l1_rate: 0
        l2_rate: 1
        ssim_rate: 0
        other_side: False
      perceptual-2 : rate=0.01000
        pred_n: synth_feats_1_{}
        target_n: raw_feats_1_{}
        l1_rate: 0
        l2_rate: 1
        ssim_rate: 0
        other_side: False
      perceptual-3 : rate=0.01000
        pred_n: synth_feats_2_{}
        target_n: raw_feats_2_{}
        l1_rate: 0
        l2_rate: 1
        ssim_rate: 0
        other_side: False
      smooth : rate=0.00080
        pred_n: disp_{}
        image_n: color_{}
        more_kernel: True
        map_ch: 1
        gamma_rate: 2
        gray_img: True
        relative_smo: False
# Model Initialization Done!
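
To spot differences between two runs quickly, the `# Options` sections of two _meta.txt files can be compared key by key. This is a rough sketch that assumes the layout shown in the log above (a `# Options` header followed by `  - key: value` lines); `parse_options` and `diff_options` are hypothetical helpers, not part of the repo.

```python
import re

# Matches lines such as "  - batch_size: 12".
OPTION_LINE = re.compile(r"^\s*-\s*(\w+):\s*(.*)$")


def parse_options(meta_text: str) -> dict:
    """Collect the '  - key: value' lines under the '# Options' header."""
    options = {}
    in_options = False
    for line in meta_text.splitlines():
        if line.startswith("#"):
            if line.strip() == "# Options":
                in_options = True
            elif not line.startswith("#---"):
                # Any other section header ends the Options section.
                in_options = False
            continue
        if in_options:
            match = OPTION_LINE.match(line)
            if match:
                options[match.group(1)] = match.group(2).strip()
    return options


def diff_options(mine: dict, theirs: dict) -> dict:
    """Return {key: (mine, theirs)} for every option that differs."""
    keys = sorted(set(mine) | set(theirs))
    return {
        k: (mine.get(k), theirs.get(k))
        for k in keys
        if mine.get(k) != theirs.get(k)
    }
```

Reading each file into a string, passing both through `parse_options`, and then calling `diff_options` lists only the settings that differ, for example a different `batch_size` or `seed`.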