ifnspaml / SGDepth

[ECCV 2020] Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
MIT License

kitti_zhou_train loader #15

Closed Ale0311 closed 3 years ago

Ale0311 commented 3 years ago

Hello!

As a next step, after managing to train the model on the Mapillary dataset, I would also like to experiment with different depth datasets. For starters, I only have one question: why does the kitti_zhou_train loader make use of the right images? Is it only to have more training samples? As far as I know, the training process is monocular, not stereo.

Here is a snippet from the kitti_zhou_train loader:

    dataset_left = StandardDataset(
        data_transforms=transforms_common,
        **cfg_left,
        **cfg_common
    )

    dataset_right = StandardDataset(
        data_transforms=[tf.ExchangeStereo()] + transforms_common,
        **cfg_right,
        **cfg_common
    )

    dataset = ConcatDataset((dataset_left, dataset_right))

    loader = DataLoader(
        dataset, batch_size=batch_size, shuffle=True,
        num_workers=num_workers, pin_memory=True, drop_last=True
    )

Thanks again! 😊

klingner commented 3 years ago

Hello,

Yes, the right images are only used to increase the amount of training data. In the split defined by [Zhou et al., SfMLearner, CVPR 2017], both the left and the right images from the KITTI dataset are used for training.
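To illustrate the effect, here is a minimal, torch-free sketch of what the concatenation does to the sample count. All names in it are hypothetical stand-ins rather than the repository's API: `exchange_stereo` and the `"color"` / `"color_right"` keys assume that `tf.ExchangeStereo()` swaps the two views so that the right image takes the place of the left one, and `ListDataset` / `Concat` only mimic the general behaviour of a map-style dataset and of `ConcatDataset`.

```python
# Torch-free sketch; every name below is illustrative, not the repository's API.

def exchange_stereo(sample):
    """Hypothetical stand-in: swap the left and right views of a sample."""
    swapped = dict(sample)
    swapped["color"] = sample["color_right"]
    swapped["color_right"] = sample["color"]
    return swapped


class ListDataset:
    """Minimal map-style dataset: a list of samples plus optional transforms."""

    def __init__(self, samples, transforms=()):
        self.samples = list(samples)
        self.transforms = tuple(transforms)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        sample = self.samples[index]
        for transform in self.transforms:
            sample = transform(sample)
        return sample


class Concat:
    """Chains datasets end to end, similar in spirit to ConcatDataset."""

    def __init__(self, datasets):
        self.datasets = list(datasets)

    def __len__(self):
        return sum(len(d) for d in self.datasets)

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("negative indices are not supported in this sketch")
        for dataset in self.datasets:
            if index < len(dataset):
                return dataset[index]
            index -= len(dataset)
        raise IndexError("index out of range")


# Three stereo frames, given here as placeholder file names.
frames = [
    {"color": f"left_{i}.png", "color_right": f"right_{i}.png"}
    for i in range(3)
]

dataset_left = ListDataset(frames)
dataset_right = ListDataset(frames, transforms=[exchange_stereo])
dataset = Concat((dataset_left, dataset_right))

# Each stereo frame now contributes two monocular training samples.
monocular_inputs = [dataset[i]["color"] for i in range(len(dataset))]
```

Under these assumptions the network never sees a stereo pair as such. It only ever reads the `"color"` entry, which holds a left image for the first half of the indices and a right image for the second half.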

Ale0311 commented 3 years ago

Thank you!

I do have one more question, though. Why doesn't the loader also return the depth ground truth?

klingner commented 3 years ago

There is no special reason; I simply have not needed it during training so far. In principle, you could also load the depth ground truth if you would like to.
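As a rough sketch of what that could look like, one option is to wrap an existing dataset and attach a depth map to each sample. Everything here is hypothetical: `WithDepth`, the `"depth"` key, and the `load_depth` callback are illustrative names and not part of the repository, and the placeholder depth values stand in for whatever a real ground-truth reader would return.

```python
class WithDepth:
    """Hypothetical wrapper that adds a depth entry to each sample dict."""

    def __init__(self, dataset, load_depth):
        self.dataset = dataset          # any indexable collection of sample dicts
        self.load_depth = load_depth    # callable: index -> depth map

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        # Copy the sample so the wrapped dataset is left unchanged.
        sample = dict(self.dataset[index])
        sample["depth"] = self.load_depth(index)
        return sample


def fake_depth(index):
    """Placeholder for a real ground-truth reader: returns a 2x2 constant map."""
    return [[float(index)] * 2 for _ in range(2)]


base = [{"color": "left_0.png"}, {"color": "left_1.png"}]
with_depth = WithDepth(base, fake_depth)
sample = with_depth[1]
```

Since the training itself does not use the ground truth, the extra key would mainly be useful for evaluation or logging.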