Open FangjunWang opened 11 months ago
Did you load a pre-trained model (self-supervised pre-trained), and what are your SSL scores?
Thank you for your reply! Yes, I loaded a pre-trained model trained with the script train.py for 20 epochs. What is the SSL score? The training loss is around 0.3–0.4, and the validation silog is 7.467.
I mean, what are your SSL model's metrics: AbsRel, RMSE, etc.?
AbsRel
Em, AbsRel = ? I mean your SSL model's evaluation results on KITTI, not the SiLog loss.
The SSL model's evaluation results on KITTI are: abs_rel: 0.060, rmse: 2.642.
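For reference, these are the standard Eigen-split depth metrics. A minimal sketch of how they are typically computed (monodepth2-style; it assumes `gt` and `pred` are already masked to valid pixels and median-scaled):

```python
import numpy as np

def compute_depth_metrics(gt: np.ndarray, pred: np.ndarray):
    """Standard KITTI depth metrics (monodepth2-style sketch).

    Assumes gt and pred are 1-D arrays of valid, median-scaled depths in meters.
    """
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()
    abs_rel = float(np.mean(np.abs(gt - pred) / gt))
    sq_rel = float(np.mean((gt - pred) ** 2 / gt))
    rmse = float(np.sqrt(np.mean((gt - pred) ** 2)))
    rmse_log = float(np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2)))
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3
```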
That's interesting, you got better SSL scores but worse SSL+Sup scores.
Can you provide your fine-tuning args? I think you should choose a much smaller learning_rate.
--name cvnXt_075_1130 --root weights/inc_kitti_exps --load_weights_folder weights/convnext_large/cvnXt_075/models/weights_15 --epochs 5 --bs 8 --lr 1e-5 --wd 0.01 --div_factor 10 --final_div_factor 100 --validate_every 250 --dataset kitti --workers 8 --w_chamfer 0 --data_path datasets/KITTI/raw --gt_path datasets/KITTI/gts/train --filenames_file ./finetune/train_test_inputs/kitti_eigen_train_files_with_gt.txt --input_height 320 --input_width 1024 --min_depth 0.001 --max_depth 80 --do_random_rotate --degree 1.0 --data_path_eval datasets/KITTI/raw --gt_path_eval datasets/KITTI/gts/val --filenames_file_eval ./finetune/train_test_inputs/kitti_eigen_test_files_with_gt.txt --min_depth_eval 1e-3 --max_depth_eval 80 --do_kb_crop --garg_crop --same_lr
--epochs 5 --bs 8 --lr 1e-5
I recommend --bs 16 and I think lr should be smaller, 1e-6, 5e-6, etc.
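The --div_factor and --final_div_factor flags in the args above suggest a one-cycle schedule, in which case --lr sets the peak rate of the whole cycle. A minimal sketch of how those flags would map onto PyTorch's OneCycleLR (the model and step count are placeholders, not the repo's actual setup):

```python
import torch

model = torch.nn.Linear(10, 1)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-6, weight_decay=0.01)  # --lr, --wd
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=5e-6,           # --lr: peak learning rate of the cycle
    total_steps=1000,      # epochs * steps_per_epoch in the real script
    div_factor=10,         # --div_factor: initial lr = max_lr / 10
    final_div_factor=100,  # --final_div_factor: min lr = initial lr / 100
)
```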
Thanks! I will try.
Hi @FangjunWang, I am very excited to reproduce the ConvNeXt results as well. However, I am currently stuck on the first stage (SSL training). I ran the training with: python train.py ./args_files/hisfog/kitti/cvnXt_L_320x1024.txt
In cvnXt_L_320x1024.txt I changed only data_path, log_dir, and batch_size=8 (instead of the original 16; as I understood, you made the same change). In other experiments I also tried a lower lr and removed the diff_lr argument, but saw no improvement. I then computed the score with: evaluate_depth_config.py args_files/hisfog/kitti/cvnXt_L_320x1024.txt, where I changed load_weights_folder to my weights path. I tried the weights from all epochs, and the best result is:

| abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
|---|---|---|---|---|---|---|
| 0.096 | 0.765 | 4.455 | 0.176 | 0.908 | 0.966 | 0.983 |

which is much worse than the original and yours. Could you please help me understand what I did wrong in the first stage, so I can fix it and move on to stage 2 (fine-tuning)?
Should I download a pretrained PoseNet or other weights, or maybe I computed the metrics the wrong way (though I checked the downloaded ResNet model, and it reproduces the same score @hisfog reports in the repo)? Could you please share your parameters, along with brief instructions on how to reproduce the score? I would be very thankful for any help.
Hello, my parameters are: --data_path datasets/KITTI/raw/ --log_dir weights/convnext_large --model_name cvnXt_075 --dataset kitti --eval_split eigen --backbone convnext_large --height 320 --width 1024 --batch_size 8 --num_epochs 20 --scheduler_step_size 10 --model_dim 32 --patch_size 32 --dim_out 64 --query_nums 64 --dec_channels 1024 512 256 128 --min_depth 0.001 --max_depth 80.0 --diff_lr --use_stereo --load_weights_folder weights/ConvNeXt_Large_SQLdepth --eval_mono --post_process --save_pred_disps
I did not change anything else besides the above parameters. Hope this helps!
@FangjunWang, thank you for the quick response. I have the same parameters. Do you train using only this command: python train.py ./args_files/hisfog/kitti/cvnXt_L_320x1024.txt and test using this: evaluate_depth_config.py args_files/hisfog/kitti/cvnXt_L_320x1024.txt?
Also, did you do anything else, such as downloading pretrained weights before training, or a pretrained PoseNet?
Yes, I trained and evaluated the model using the same commands. I only loaded the pretrained weights convnext_large_22k_1k_224.pth.
@FangjunWang, for that you should change the params to: --backbone convnext_large_in22ft1k. Did you do that, or did you manually change convnext_large to convnext_large_22k_1k_224.pth?
I changed networks/Unet.py like this:

```python
if backbone == "convnext_large":
    pretrained = False
    backbone_kwargs = {"checkpoint_path": "weights/convnext_large_22k_1k_224_filtered.pth"}
encoder = create_model(
    backbone,
    features_only=True,
    out_indices=backbone_indices,
    in_chans=in_channels,
    pretrained=pretrained,
    **backbone_kwargs,
)
```
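The "_filtered" suffix suggests the classifier head was stripped from the timm checkpoint before loading it into the features-only encoder. A hypothetical sketch of such a filtering step (the key prefix and filenames are assumptions, not taken from the repo):

```python
import torch

# Hypothetical: drop classification-head weights so the checkpoint
# loads cleanly into a features_only ConvNeXt encoder.
ckpt = torch.load("weights/convnext_large_22k_1k_224.pth", map_location="cpu")
state = ckpt.get("model", ckpt)  # timm checkpoints often nest weights under "model"
filtered = {k: v for k, v in state.items() if not k.startswith("head.")}
torch.save(filtered, "weights/convnext_large_22k_1k_224_filtered.pth")
```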
@FangjunWang, thanks. Could you please write me an email at nick_93@ukr.net, so I can contact you directly with other questions?
@FangjunWang, I have tried convnext_large_22k_1k_224 as you suggested; it gives slightly better results, but the situation is similar. For resnet50 I was able to mostly reproduce the original score:

| abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
|---|---|---|---|---|---|---|
| 0.084 | 0.646 | 3.972 | 0.163 | 0.923 | 0.969 | 0.983 |

But for ConvNeXt I see the following: the model improves for the first 6–9 epochs, and after that it does not improve but gets worse and worse. Did you get similar results, or did you see improvement at roughly every epoch?

| epoch | abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
|---|---|---|---|---|---|---|---|
| 1 | 0.091 | 0.704 | 4.197 | 0.173 | 0.916 | 0.966 | 0.982 |
| 2 | 0.091 | 0.675 | 4.182 | 0.168 | 0.918 | 0.968 | 0.983 |
| 3 | 0.088 | 0.701 | 4.279 | 0.167 | 0.923 | 0.969 | 0.983 |
| 4 | 0.085 | 0.625 | 4.017 | 0.166 | 0.926 | 0.968 | 0.983 |
| 5 | 0.084 | 0.664 | 4.079 | 0.165 | 0.928 | 0.969 | 0.983 |
| 6 | 0.082 | 0.647 | 4.119 | 0.167 | 0.926 | 0.967 | 0.982 |
| 7 | 0.087 | 0.745 | 4.389 | 0.170 | 0.921 | 0.967 | 0.982 |
| 8 | 0.086 | 0.707 | 4.256 | 0.169 | 0.923 | 0.967 | 0.982 |
| 9 | 0.088 | 0.748 | 4.397 | 0.173 | 0.920 | 0.966 | 0.981 |
@Lavreniuk I'm running into the same problem as you. I've tried the ImageNet-pretrained ConvNeXt model and the PoseNet provided by @hisfog (#14). Can you let me know if you make any progress?
Here are my parameters:
--data_path /mnt/RG/dataset/kitti_data --log_dir /mnt/RG/SfMNeXt-Impl/boost --model_name cvnXt_high --dataset kitti --eval_split eigen --backbone convnext_large_in22ft1k --height 320 --width 1024 --batch_size 8 --num_epochs 20 --scheduler_step_size 10 --model_dim 32 --patch_size 32 --dim_out 64 --query_nums 64 --dec_channels 1024 512 256 128 --min_depth 0.001 --max_depth 80.0 --diff_lr --use_stereo --load_weights_folder /mnt/RG/SfMNeXt-Impl/boost/cvnXt_low/models/weights_0 --eval_mono --post_process --pretrained_pose --pose_net_path /mnt/RG/SfMNeXt-Impl/checkpoints/pose
Hi @jerry-ryu, I have not reproduced the result of the original repo, let alone the much better result that was mentioned. I think you should train without the pretrained PoseNet, but maybe I am wrong. From what I found in other issues, the situation is similar for ResNet and other models: it seems impossible to reproduce. So I have switched my interest to other models.
Apologies for the delayed response. For reproducing results on KITTI, please DO NOT use the latest code release (I'm not sure what causes the issues above). Instead, kindly use the following version:
git checkout 6a1e997f97caef8de080bb2873f71cfbad9a8047
which is consistent with the implementation in the SQLdepth paper, without any additional modifications.
@Lavreniuk @hisfog Thank you for your kind response, I will try again and let you know.
@hisfog Thank you so much, I was finally able to reproduce SQLdepth on resnet50 1024x320.
I will post my experimental results and args files for those who want to train SQLdepth.
- Depth metrics:
  - paper: [image]
  - ResNet50 320x1024 trained: [image]
  - ConvNeXt 192x640 trained: [image]
ResNet50 320x1024
- args (trained at commit 6a1e997f97caef8de080bb2873f71cfbad9a8047, as suggested above; do not use the latest code release):
--data_path /mnt2/RG/data
--log_dir /mnt2/RG/SfMNeXt-Impl/sqldepth_log/
--model_name resnet_320x1024
--dataset kitti
--eval_split eigen
--backbone resnet_lite
--height 320
--width 1024
--batch_size 10
--num_epochs 25
--scheduler_step_size 15
--model_dim 32
--patch_size 20
--dim_out 128
--query_nums 128
--num_features 256
--num_layers 50
--min_depth 0.001
--max_depth 80.0
--use_stereo
--load_weights_folder /mnt2/RG/SfMNeXt-Impl/sqldepth_log/resnet_320x1024/models/weights_24
--eval_mono
--post_process
ConvNeXt 192x640 (due to limited GPU capacity, 192x640 was used instead of 320x1024)
- args (same commit as above):
--data_path /mnt2/RG/data
--log_dir /mnt2/RG/SfMNeXt-Impl/sqldepth_log/
--model_name cvnXt_192x640
--dataset kitti
--eval_split eigen
--backbone convnext_large
--height 192
--width 640
--batch_size 8
--num_epochs 25
--scheduler_step_size 15
--model_dim 32
--patch_size 16
--dim_out 64
--query_nums 64
--dec_channels 1024 512 256 128
--min_depth 0.001
--max_depth 80.0
--use_stereo
--load_weights_folder /mnt2/RG/SfMNeXt-Impl/sqldepth_log/cvnXt_192x640/models/weights_24
--eval_mono
--post_process
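(These args files are plain flag/value lists consumed by the training and evaluation scripts shown earlier. A rough sketch of the general pattern for feeding such a file to argparse — an assumption about the mechanism, not the repo's actual loader:)

```python
import argparse
import sys

# Sketch: parse an args file like the ones above (one or more flags per line).
parser = argparse.ArgumentParser()
parser.add_argument("--data_path")
parser.add_argument("--height", type=int)
parser.add_argument("--width", type=int)
# ... remaining flags omitted for brevity

with open(sys.argv[1]) as f:
    argv = f.read().split()  # whitespace-split the whole file into tokens
opts, _ = parser.parse_known_args(argv)  # tolerate flags not declared above
print(opts)
```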
Thank you again for your wonderful code, and congratulations on the paper acceptance!
p.s. I don't see any special change between the commit you mentioned and the latest code, so if you have any ideas about what made the experimental results so different, I'd appreciate it if you could tell me.
Thank you @hisfog and @jerry-ryu for the kind responses and for sharing the experiment settings.
Background: I was in the same situation, where I couldn't get results similar to the numbers reported in the paper when using the latest code. Now that I know about this issue, I'm training with the suggested commit, but I'm curious what caused the difference in my results.
I checked the differences between the latest commit and 6a1e997f97caef8de080bb2873f71cfbad9a8047, and the most notable difference I could find was the filename changes in splits/eigen_zhou/train_files.txt, which could possibly affect the training. @hisfog, do you think this is the cause?
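A quick way to confirm that split difference from inside the repo (a sketch using git show; run from the repository root):

```python
import subprocess

def split_at(rev: str) -> set[str]:
    """Return the eigen_zhou train split as stored at a given commit."""
    out = subprocess.run(
        ["git", "show", f"{rev}:splits/eigen_zhou/train_files.txt"],
        capture_output=True, text=True, check=True,
    ).stdout
    return set(out.splitlines())

old = split_at("6a1e997f97caef8de080bb2873f71cfbad9a8047")
new = split_at("HEAD")
print(f"{len(old - new)} entries only in the paper commit, "
      f"{len(new - old)} entries only in the latest commit")
```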
@NoelShin I looked it up after seeing your reply, and it seems quite reasonable. Thank you for finding it!!
Hello! I noticed that too. Do you know which paper the old split came from?
Hello, I tried to reproduce resnet50 640x192, but my results are far from the reported ones.
Here is my args_res50_kitti_192x640_train.txt:
--data_path /home/ccy/project/kitti_data/ --dataset kitti --eval_split eigen --height 192 --width 640 --batch_size 6 --num_epochs 25 --model_dim 64 --patch_size 16 --query_nums 120 --scheduler_step_size 15 --eval_mono --load_weights_folder /home/Process3/tmp/mdp/res50_models/weights_19 --post_process --min_depth 0.001 --max_depth 80.0 --ext jpg --model_name mdp2 --log_dir /home/ccy/tmp/
And here is args_res50_kitti_192x640_eval.txt: --data_path /home/ccy/project/kitti_data/ --dataset kitti --eval_split eigen --height 192 --width 640 --batch_size 6 --model_dim 64 --patch_size 16 --query_nums 120 --eval_mono --load_weights_folder /home/ccy/tmp/mdp2/models/weights_8/ --post_process --min_depth 0.01 --max_depth 80.0 --save_pred_disps
The dataset I used is the kitti_data processed as in monodepth2.
I have been digging into this for a long time without finding the cause. I am very much looking forward to your reply and guidance, thank you.
I ran into the same problem: I also only changed the dataset path, and the accuracy after training was extremely low. Have you solved it? Is there a solution?
Have you solved this problem? I also used ./args_files/args_res50_kitti_192x640_train.txt, and made no major changes.
Author, I would like to ask a question that has been bothering me for a long time: when I do not use the pretrained PoseNet and do not set use_stereo, the result degrades to the same as above: abs_rel 0.457.
@hisfog, thank you again for the excellent model and code base. I've successfully reproduced all the results and have made further improvements to the model. If you or @jerry-ryu, or anyone else interested, would like to check it out, I've shared the updated code. I'd appreciate it if you could take a look and upvote if you find it helpful (I will soon upload the pretrained weights as well): https://github.com/Lavreniuk/SPIdepth
I have reviewed the code you uploaded, and I would like to ask whether your code can run experiments on the KITTI dataset at 640x192. I tried to replicate SQLdepth before, but the results were quite different from the paper. My current research direction is also self-supervised monocular depth estimation, so I am interested in your work and in SQLdepth. Could you provide me with some information about your work or about reproducing SQLdepth? If possible, thank you very much.
@lmz-sense, could you clarify what you mean by "the results are quite different from the paper"? Are the results worse, or is there another issue? I haven't trained at 640x192, only at a larger scale, but I believe it should work at 640x192 as well. Please check the training file in my repo, as the last commit in SQLdepth has an error in splits/eigen_zhou/train_files.txt (as mentioned above).
@Lavreniuk My training did not achieve the same results as the original paper, and there were also some bugs during training. When evaluating with the trained weights, all metrics were fixed values, regardless of which epoch's weights I used. The results are as follows: [image]
What I'd recommend:
- Check once again that training file (splits/eigen_zhou/train_files.txt).
- Run evaluation on the pretrained author weights; the numbers should match the ones he reported, otherwise you have not set up the dataset properly.
- Run training with resnet_320x1024; I have run it and the result is almost the same as claimed. After that, switch to your config with the lower resolution, etc.
Thank you for your suggestion. I will test it again immediately based on your advice. If I have any questions, I will consult you again. Thank you very much!
@Lavreniuk Thank you so much for sharing your code. I'm already excited to read the paper and reproduce the experimental results.
I'd like to ask whether it's possible to reproduce the results without the use_stereo option or a pretrained PoseNet, unlike SQLdepth.
Once again, thank you for your contribution 😀
@jerry-ryu, I was also curious about training from scratch without a pretrained pose net and why it doesn't learn jointly with the rest of the network. Through experimentation, I found that the pose net is much more fragile than the depth prediction backbone. However, by using a stronger feature extractor for the pose net, it's possible to train the entire model without pretraining it. Once training is complete, you can slightly improve accuracy by reinitializing the trained pose net and retraining from scratch, though the gains are minimal. A more powerful pose net doesn’t significantly affect training time or inference, given its small size relative to the full model. Interestingly, increasing the backbone size doesn’t help unsupervised learning much but does improve results when fine-tuning with ground truth data. I hope my code benefits the community, and I'd appreciate it if you could upvote my GitHub!
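For illustration, a hedged sketch of the "stronger pose net" idea using a timm backbone (the names and sizes are assumptions, not the SPIdepth implementation; the 0.01 output scaling follows the monodepth2 convention):

```python
import timm
import torch
import torch.nn as nn

class StrongerPoseNet(nn.Module):
    """Pose network with a swappable, stronger feature extractor (sketch)."""

    def __init__(self, backbone: str = "convnext_tiny", num_frames: int = 2):
        super().__init__()
        # num_classes=0 makes timm return pooled features instead of logits;
        # in_chans lets the first conv take a stacked pair of RGB frames.
        self.encoder = timm.create_model(
            backbone, pretrained=True, num_classes=0, in_chans=3 * num_frames
        )
        self.head = nn.Linear(self.encoder.num_features, 6)  # 3 axis-angle + 3 translation

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, 3 * num_frames, H, W) -> (B, 6) pose parameters
        return 0.01 * self.head(self.encoder(frames))
```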
Hello, and nice work! My question is: how should one fine-tune the model on KITTI? I tried the script ./finetune/train_ft_SQLdepth.py but cannot get good enough results: only abs_rel 0.0494 and rmse 2.182.