visinf / self-mono-sf

Self-Supervised Monocular Scene Flow Estimation (CVPR 2020)
Apache License 2.0

.sh scripts problem #6

Closed chengrongliang closed 4 years ago

chengrongliang commented 4 years ago

Hi @hurjunhwa, those configs don't seem to be used in main.py:

--training_dataset_root=$KITTI_RAW_HOME \
--training_dataset_flip_augmentations=True \
--training_dataset_preprocessing_crop=True \
--training_dataset_num_examples=-1 \
hurjunhwa commented 4 years ago

Hi @chengrongliang Those configs are for the dataloaders in https://github.com/visinf/self-mono-sf/tree/master/datasets
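
For illustration, here is a minimal, hypothetical sketch (not the repo's actual code) of how prefixed CLI flags like `--training_dataset_num_examples` can be routed to a dataloader without appearing explicitly in the training script: parse them, strip the `training_dataset_` prefix, and forward the remainder as constructor kwargs.

```python
import argparse

def kwargs_from_args(args, prefix):
    """Collect every parsed argument starting with `prefix` into a kwargs
    dict, with the prefix removed from each key."""
    plen = len(prefix)
    return {k[plen:]: v for k, v in vars(args).items() if k.startswith(prefix)}

parser = argparse.ArgumentParser()
parser.add_argument("--training_dataset_root", type=str)
parser.add_argument("--training_dataset_flip_augmentations", default=True)
parser.add_argument("--training_dataset_num_examples", type=int, default=-1)

# Simulate the flags set in the .sh script (paths here are placeholders).
args = parser.parse_args([
    "--training_dataset_root", "/data/kitti_raw",
    "--training_dataset_num_examples", "200",
])

# These kwargs would then be passed to the dataset class in datasets/.
dataset_kwargs = kwargs_from_args(args, "training_dataset_")
print(dataset_kwargs["root"])          # /data/kitti_raw
print(dataset_kwargs["num_examples"])  # 200
```

This pattern explains why the variables never appear by name in main.py: they are consumed generically by the configuration machinery and handed to the dataset constructors.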

chengrongliang commented 4 years ago

Yes, but those params in the .sh don't seem to be passed into the dataloader configuration. I can't find those variables in core/configuration.py; it looks like the dataloader loads the default values. Am I missing something?

chengrongliang commented 4 years ago

Very sorry @hurjunhwa, those variables do take effect; I just hadn't found where you pass them into the .py functions. ^_^

chengrongliang commented 4 years ago

I have 2 questions about the paper:

  1. In Figure 3 you mention residual scene flow; what does the residual stand for specifically?
  2. You write "we adopt the monocular depth estimation approach of Godard et al. [12, 13] as our second basis". Is there a way to train scene flow (not depth) on monocular images (left image only) with this repository, without using the right image?
hurjunhwa commented 4 years ago

Hi,

  1. Here, the residual means that the model updates the estimate incrementally over the pyramid. For example, if the output from the previous pyramid level (say, l-1) is sf_(l-1), then at the current pyramid level (say, l), the model estimates the residual value sf_(l, res) and updates the final output to sf_(l) = sf_(l-1) + sf_(l, res).

  2. Yes, I think it's doable. You can turn off the disparity loss and train the network using only the scene flow loss. However (from my previous experience), the network did not converge well due to the depth ambiguity. I haven't tried it, but adding a pose estimator and the corresponding loss might help the network converge better, as in other related works.
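
The residual update from answer 1 can be sketched in a few lines. This is a hypothetical toy version (NumPy arrays standing in for network outputs, a constant stand-in for the per-level decoder), not the repo's implementation; it only demonstrates the coarse-to-fine update rule sf_l = sf_(l-1) + sf_(l, res).

```python
import numpy as np

def upsample2x(sf):
    """Nearest-neighbour 2x upsampling of an (H, W, 3) scene-flow map; the
    values are doubled because pixel distances double with resolution."""
    return 2.0 * sf.repeat(2, axis=0).repeat(2, axis=1)

def predict_residual(sf_prev, level):
    """Stand-in for the network's per-level decoder; returns a small
    constant correction so the update rule stays visible."""
    return np.full_like(sf_prev, 0.1 / (level + 1))

# Coarsest pyramid level: estimate from scratch at 4x4 resolution.
sf = np.zeros((4, 4, 3))

for level in range(1, 3):  # two finer pyramid levels
    sf_up = upsample2x(sf)                        # sf_(l-1), upsampled
    sf = sf_up + predict_residual(sf_up, level)   # sf_l = sf_(l-1) + sf_(l, res)

print(sf.shape)  # (16, 16, 3)
```

Predicting only the residual at each level means every decoder solves a small correction problem rather than re-estimating the full flow, which is the standard motivation for coarse-to-fine refinement.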

chengrongliang commented 4 years ago

Thank you very much. I will have a try.