HKUST-Aerial-Robotics / MVDepthNet

This repository provides a PyTorch implementation of the 3DV 2018 paper "MVDepthNet: real-time multiview depth estimation neural network".
GNU General Public License v3.0

Training Code #9

Closed · boni-hu closed this 5 years ago

boni-hu commented 5 years ago

Dear, Thank you for your great contribution, firstly. I used my own data with example2.py, but the result is a bit poor. I think that if I can train on this data, the result may be better. Can you give me the training code? Many thanks.

Yours sincerely, HuBoni

WANG-KX commented 5 years ago

Dear, Thanks for your interest! I do not understand "I think if i can train this data". Do you mean you want to train the model on your own dataset? I think you can do this if you have a good and large dataset. I can provide the training code, but without the training dataset you will not be able to use it directly. I have provided some information in the paper; please check the details there first. The code will be posted later under this issue. Regards, Kaixuan

boni-hu commented 5 years ago

Dear, Thank you sincerely for your reply. I do want the training code. I tested the model on a pair of images using example2.py; the result is as follows. [screenshot of the depth estimation result]

Sincerely, HuBoni

WANG-KX commented 5 years ago

Dear, In most cases, the baseline length and the camera pose quality influence the result a lot. As shown in your example, I think the baseline is too small. You could use a threshold to select keyframes for depth estimation, as in the sketch below. Also, this sample is from the TUM RGB-D dataset, right? It is included in the training set of the model. Please see the paper for more details.
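For illustration, a minimal sketch of such a baseline check (the select_keyframe helper and the min_baseline value are hypothetical, not part of this repository):

import numpy as np

def select_keyframe(t_ref, t_cur, min_baseline=0.05):
    # t_ref, t_cur: 3-vector camera translations of the reference and the
    # current frame, expressed in a common world frame.
    # min_baseline: hypothetical threshold in meters; tune it for your scene.
    baseline = np.linalg.norm(np.asarray(t_cur) - np.asarray(t_ref))
    return baseline > min_baseline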

boni-hu commented 5 years ago

Dear, Yes, it is the TUM RGB-D freiburg3 dataset. Sorry, I will read the paper again carefully. Next I will train and test on my own data. Many thanks again, and I apologize for any inconvenience caused by my poor English; I only started my graduate study recently. Sincerely, HuBoni

boni-hu commented 5 years ago

Dear, I now have my own dataset together with its R, T, and camera intrinsics K. Can you give me the training code? Thank you very much. Regards, Boni

WANG-KX commented 5 years ago

share.tar.gz Dear, Here I have uploaded some files, including the data loader (with augmentation), the network model, the loss definition, and the training script. Please note that the code is not ready to run in your environment; it is for reference only. The training data is not provided since it is very large, and the datasets are open sourced by other projects; you can download them on the internet. Regards, Kaixuan
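As a rough illustration of the kind of loss such a training script might define, here is a hedged sketch of a multi-scale L1 loss on inverse depth (the function name and the uniform scale weighting are assumptions, not necessarily the exact definition in share.tar.gz):

import torch
import torch.nn.functional as F

def multi_scale_inv_depth_l1(pred_inv_depths, gt_inv_depth):
    # pred_inv_depths: list of predicted inverse-depth maps (N, 1, h, w)
    # at several scales; gt_inv_depth: ground truth at full resolution.
    loss = 0.0
    for pred in pred_inv_depths:
        # Resize the ground truth to match each prediction scale.
        gt = F.interpolate(gt_inv_depth, size=pred.shape[2:],
                           mode='bilinear', align_corners=False)
        loss = loss + torch.mean(torch.abs(pred - gt))
    return loss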

boni-hu commented 5 years ago

Dear, Thank you very much! I am very interested in your project, and I want to start in this field with it. ^_^

Sincerely, Boni

WANG-KX commented 5 years ago

Dear, Good luck! Regards, Kaixuan

0xffee00 commented 5 years ago

Thanks for providing the training code. Is the geometric data augmentation included in the scripts?

WANG-KX commented 5 years ago

Dear, Sorry, I forgot to include that part. The idea is to flip both the cost volume and the ground truth depth (or whatever you want, just keep them consistent). However, PyTorch does not provide a good function to flip tensors. You can use torch.nn.functional.grid_sample to flip the cost volume and the depth. The code is like:

import numpy as np
import torch
import torch.nn.functional as F

class Flip_h():
    def __init__(self):
        # Sampling grid that maps each pixel (i, j) to its horizontally
        # mirrored position (i, 319 - j) for a 256 x 320 tensor.
        coordinate = np.zeros((256, 320, 2))
        for i in range(256):
            for j in range(320):
                coordinate[i, j, :] = [319 - j, i]
        coordinate = coordinate.astype(np.float32)
        coordinate = torch.from_numpy(coordinate).cuda().unsqueeze(0)
        # Normalize to the [-1, 1] range expected by F.grid_sample.
        coordinate[:, :, :, 0] = coordinate[:, :, :, 0] / (320 - 1) * 2.0 - 1.0
        coordinate[:, :, :, 1] = coordinate[:, :, :, 1] / (256 - 1) * 2.0 - 1.0
        self.coordinate = coordinate

    def flip(self, x):
        # align_corners=True matches the (size - 1) normalization above.
        return F.grid_sample(x, self.coordinate, align_corners=True)

class Flip_v():
    def __init__(self):
        # Sampling grid that maps each pixel (i, j) to its vertically
        # mirrored position (255 - i, j).
        coordinate = np.zeros((256, 320, 2))
        for i in range(256):
            for j in range(320):
                coordinate[i, j, :] = [j, 255 - i]
        coordinate = coordinate.astype(np.float32)
        coordinate = torch.from_numpy(coordinate).cuda().unsqueeze(0)
        coordinate[:, :, :, 0] = coordinate[:, :, :, 0] / (320 - 1) * 2.0 - 1.0
        coordinate[:, :, :, 1] = coordinate[:, :, :, 1] / (256 - 1) * 2.0 - 1.0
        self.coordinate = coordinate

    def flip(self, x):
        return F.grid_sample(x, self.coordinate, align_corners=True)
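A hypothetical usage of these helpers (the tensor names and shapes are illustrative); flipping the cost volume and the ground-truth depth with the same grid keeps them consistent:

flip_h = Flip_h()
cost_volume = torch.rand(1, 64, 256, 320).cuda()  # example (N, C, H, W) input
depth = torch.rand(1, 1, 256, 320).cuda()
cost_volume_flipped = flip_h.flip(cost_volume)
depth_flipped = flip_h.flip(depth)

Note that newer PyTorch releases also provide torch.flip (e.g. torch.flip(x, dims=[3]) for a horizontal flip), which makes the grid_sample workaround unnecessary.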

0xffee00 commented 5 years ago

Thanks. Any chance you have the scripts used to convert the various dataset formats into the pickle file format used for training? The TUMDataset class seems to be missing as well.