ialhashim / DenseDepth

High Quality Monocular Depth Estimation via Transfer Learning
https://arxiv.org/abs/1812.11941
GNU General Public License v3.0

Training on KITTI #26

Closed: OManela closed this issue 5 years ago

OManela commented 5 years ago

Hi,

Thanks for sharing your code. I'm trying to train on KITTI and after reading all the preceding questions and answers I still have two questions:

  1. The original depth data in KITTI is 16-bit. After division by 256 you get values in meters. After running the fill_depth_colorization.py script, the range of the output depth is [0, 85]. What did you do with the zeros? Should I just change them to 0? (This is what I currently do.) Should I first scale the data to the range [0, 255]? (As far as I understand, this is the range assumed in the function NYU_BasicAugmentRGBSequence: y = np.clip(np.asarray(Image.open( BytesIO(self.data[sample[1]]) )).reshape(480,640,1)/255*self.maxDepth,0,self.maxDepth))

  2. In NYU_BasicRGBSequence - why do you divide by 10.0 in this line: y = np.asarray(Image.open(BytesIO(self.data[sample[1]])), dtype=np.float32).reshape(480,640,1).copy().astype(float) / 10.0

Should I change it to 80.0 for KITTI?

Thanks, Ofer

ialhashim commented 5 years ago
  1. Almost all depth sensors produce some zero or invalid depth values. The NYU dataset uses a depth-filling technique to create fully dense depth maps. We use the exact same method to fill the KITTI dataset as a preprocessing step before training, as mentioned in the paper.

  2. This division only changes the range. For any other dataset you can train in whatever range you like, so it's up to you to ensure that the depth values in each batch are correct and that the other parts of the code relating to visualization or inference use the same range. As far as I can tell, the choice of range is not essential to performance; it's just a convention I used.
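
To make the range point concrete, here is a minimal sketch of keeping one convention end to end. It is not code from this repository. The helper names, the [0, 1] training range, and the 80 m cap are assumptions chosen for illustration; the divide-by-256 step is the KITTI convention described in the question.

```python
import numpy as np

KITTI_PNG_SCALE = 256.0  # raw uint16 value = meters * 256, as stated in the question
MAX_DEPTH_M = 80.0       # assumed cap; the thread only floats 80 as a candidate


def kitti_png_to_meters(raw):
    """Convert raw uint16 KITTI depth values to meters (0 stays 0)."""
    return raw.astype(np.float32) / KITTI_PNG_SCALE


def to_training_range(depth_m, max_depth=MAX_DEPTH_M):
    """Clip to [0, max_depth] meters, then scale to [0, 1] (hypothetical convention)."""
    return np.clip(depth_m, 0.0, max_depth) / max_depth


def from_training_range(pred, max_depth=MAX_DEPTH_M):
    """Undo to_training_range so predictions are in meters again."""
    return pred * max_depth


# Tiny synthetic "depth image" standing in for a decoded 16-bit PNG.
raw = np.array([[0, 256, 2560], [12800, 20480, 21760]], dtype=np.uint16)

depth_m = kitti_png_to_meters(raw)       # values in meters, up to 85
target = to_training_range(depth_m)      # what a training batch would contain
restored = from_training_range(target)   # what inference/visualization would show
```

Whatever range is chosen, the same constant has to appear in the data loader and in the inference or visualization code. Note that values above the cap (85 m here) are clipped, so they do not survive the round trip.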

OManela commented 5 years ago

Thanks for the prompt reply!

Ofer

In reply to Ibraheem Alhashim's comment of Thu, Jun 6, 2019, 3:39 PM.