TRI-ML / packnet-sfm

TRI-ML Monocular Depth Estimation Repository
https://tri-ml.github.io/packnet-sfm/
MIT License

Fine tuning PackNet01_MR_selfsup_D.ckpt #158

Closed: truncs closed this issue 3 years ago

truncs commented 3 years ago

Hi,

I was trying to fine-tune the PackNet model on a set of imagery I collected. The dataset is an outdoor dataset similar to KITTI/DDAD. The results I see after 1000 epochs are very strange (I am intentionally trying to overfit):

frame000062_depth_2

The results from the original model still make some sense (though not completely; the left wall's depth is very inaccurate):

frame000062_depth

While fine-tuning, the loss did go down from 0.20 to 0.12, but I am a bit surprised that even after 1000 epochs I couldn't overfit the model. Do you have any suggestions on what I could try?

VitorGuizilini-TRI commented 3 years ago

It is not surprising that our pre-trained KITTI/DDAD models don't transfer very well to this new dataset; it's a very different domain. As for fine-tuning or training your own models, it should work if the dataset is compatible with self-supervised learning (i.e., it has proper camera motion with translation, brightness constancy, etc.). How many images are you using to overfit, and can you check whether they have proper camera motion and brightness consistency between consecutive images?
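A quick way to eyeball both conditions is to compare consecutive frames directly, before involving the trainer at all. The sketch below is a minimal, hypothetical check in plain NumPy/PIL (not part of packnet-sfm); the sequence path and frame pattern are placeholders. A large jump in mean intensity between frames hints at auto-exposure or lighting changes, and a near-zero frame difference hints that the camera barely moved.

import glob

import numpy as np
from PIL import Image

# Hypothetical sequence path and frame pattern -- adjust to your dataset layout.
frames = sorted(glob.glob('/path/to/sequence/*.jpg'))

for prev_path, curr_path in zip(frames[:-1], frames[1:]):
    prev = np.asarray(Image.open(prev_path).convert('L'), dtype=np.float32)
    curr = np.asarray(Image.open(curr_path).convert('L'), dtype=np.float32)

    # Brightness constancy: mean intensity should stay roughly stable between frames.
    brightness_shift = abs(prev.mean() - curr.mean())

    # Camera motion: near-identical frames (no translation) give ~0 mean absolute difference.
    apparent_motion = np.abs(prev - curr).mean()

    print(f'{curr_path}: brightness shift={brightness_shift:.2f}, '
          f'frame difference={apparent_motion:.2f}')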

truncs commented 3 years ago

I am training on 6500 images. They should have proper camera motion with translation, brightness constancy, etc. Some examples of consecutive images are below (there is a small translation between the frames). These images were taken while I was walking around the block.

frame000535 frame000536 frame000537

truncs commented 3 years ago

Actually, after training for 50 epochs it's giving much better results. So you are right that models pre-trained on KITTI/DDAD don't really work well on a new dataset. Maybe this requires a much bigger and more diverse dataset, along with learning the intrinsics as in Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras?

porwalnaman01 commented 3 years ago

Hello, I am also trying to train on a different dataset that contains just images in a folder. I am using ImageDataset, but when I run the training script it reads the validation files and does not read the training files. My dataset is a folder of .jpg images. Can you help me here? Thanks in advance!

Config file that I am using:

model:
    name: 'SelfSupModel'
    optimizer:
        name: 'Adam'
        depth:
            lr: 0.0002
        pose:
            lr: 0.0002
    scheduler:
        name: 'StepLR'
        step_size: 30
        gamma: 0.5
    depth_net:
        name: 'DepthResNet'
        version: '50pt'
    pose_net:
        name: 'PoseNet'
        version: ''
    params:
        crop: 'garg'
        min_depth: 0.0
datasets:
    augmentation:
        image_shape: (192, 640)
    train:
        batch_size: 4
        dataset: ['Image']
        path: ['/disk1/dan/datasets/vgg-faces/train']
        split: ['train_split.txt']

    validation:
        dataset: ['Image']
        path: ['/disk1/dan/datasets/vgg-faces/val']
        split: ['val_split.txt']

checkpoint:
    filepath: '/disk1/dan/Naman/packnet-sfm-0.1.2/experiments1'
    monitor: 'abs_rel_pp_gt'
    monitor_index: 0
    mode: 'min'
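When the trainer picks up the validation files but reports no training files, the first thing to check is the path/split pair under datasets.train. The snippet below is a hypothetical sanity check in plain Python (not part of packnet-sfm); it assumes the split file lists one image filename per line, relative to path, which is only an assumption about how ImageDataset resolves entries, so adjust it to match the dataset class.

import glob
import os

# Paths taken from the config above; the split-file location and format are assumptions.
train_path = '/disk1/dan/datasets/vgg-faces/train'
split_file = os.path.join(train_path, 'train_split.txt')

# The training folder should contain the .jpg images the dataset is expected to find.
print('jpg files found:', len(glob.glob(os.path.join(train_path, '*.jpg'))))

# Assuming one image filename per line, relative to the train path; verify each entry exists.
with open(split_file) as f:
    entries = [line.strip() for line in f if line.strip()]
missing = [e for e in entries if not os.path.exists(os.path.join(train_path, e))]
print('split entries:', len(entries), 'missing on disk:', len(missing))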