Hi, thanks for your impressive work! I found that Table 2 has no results for the monocular training setting (only stereo, and monocular plus stereo). Did you run experiments in the monocular setting?
Hi! Thanks for your interest and kind words, and sorry for the delayed response. We have included some results in the monocular setting in the supplementary material. However, training may not succeed: due to scale ambiguity, the predefined planes may not end up in suitable positions. As a basic solution, we use a pre-trained PoseNet provided by Monodepth2.
If you are interested, here are a few suggestions (a rough sketch of both follows the list):
- AQUANet addresses this problem by supervising with a pseudo depth map during the first five epochs.
- Another solution from my experiments is to apply a loss that, during the first epoch, forces the probability of the middle ground plane to 1 in the ground region, like:
loss_init = (grid_y > 0.) * |p[56] - 1.|
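For concreteness, here is a minimal PyTorch sketch of both ideas. Treat it as illustrative only: the names (init_loss, warmup_loss, pred_depth, pseudo_depth), the [B, N_planes, H, W] probability layout, the choice of index 56 as the middle ground plane, and the convention that grid_y is normalized to [-1, 1] (so grid_y > 0 marks the lower, ground half of the image) are all assumptions, not code from this repo.

import torch
import torch.nn.functional as F

def init_loss(p, grid_y):
    # p: softmaxed plane probabilities, shape [B, N_planes, H, W] (assumed).
    # grid_y: y coordinates normalized to [-1, 1], shape [H, W] (assumed),
    # so grid_y > 0. selects the lower (ground) region of the image.
    ground = (grid_y > 0.).float()
    # Push the middle ground plane's probability toward 1 inside that region;
    # apply during the first epoch only.
    residual = (p[:, 56] - 1.).abs()
    return (ground * residual).sum() / (ground.sum() * p.shape[0]).clamp(min=1.)

def warmup_loss(pred_depth, pseudo_depth):
    # Pseudo-depth supervision in the spirit of the AQUANet suggestion:
    # apply for the first five epochs only, e.g. `if epoch < 5`.
    return F.l1_loss(pred_depth, pseudo_depth)

Either term would simply be added to the photometric loss during the early epochs and dropped afterwards.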
I hope these help; please let me know if you have any other concerns.
Thanks for your great work! can you provide the monocular experimental code?
Thanks for your great work! can you provide the monocular experimental code?
Hi! Thank you for your interest and kind words. The command for the monocular experiment is below. If you want to use a pretrained PoseNet as I did, please download the mono+stereo_640x192 model from monodepth2 and point the '--load_weights_folder' flag at that folder. Hope this helps, and feel free to reach out if you have any more questions!
CUDA_VISIBLE_DEVICES=0,1,2,3 OMP_NUM_THREADS=1 torchrun --nproc_per_node=4 train.py \
--warp_type homography_warp \
--data_path ./kitti \
--split eigen_zhou \
--log_dir ./log \
--png \
--batch_size 8 --num_workers 8 \
--learning_rate 1e-4 \
--model_name Mono_pretrainpose \
--use_denseaspp \
--num_ep 8 \
--net_type ResNet \
--pc_net vgg19 --alpha_pc 0.1 \
--alpha_smooth 0.04 \
--gamma_smooth 2 \
--use_mixture_loss \
--plane_residual \
--xz_levels 14 \
--novel_frame_ids 1 -1 \
--no_stereo \
--automask \
--load_weights_folder ./log/mono+stereo_640x192 \
--models_to_load pose pose_encoder
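One note on the last two flags, assuming they follow monodepth2's conventions: '--load_weights_folder' points at the downloaded checkpoint directory, and '--models_to_load pose pose_encoder' restores only the pose networks from it, so the depth network still trains from scratch.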