nianticlabs / monodepth2

[ICCV 2019] Monocular depth estimation from a single image

obtained some very strange depth maps #483

Open lhx0416 opened 11 months ago

lhx0416 commented 11 months ago

Thank you very much for your work! I further trained the mono model you provided on my own dataset (79 indoor images at a resolution of 1280*704; I modified the camera intrinsics and resolution in mono_dataset). However, I obtained some strange results. From top to bottom: the input RGB image, the depth map predicted directly by the original authors' model, and the depth map obtained after fine-tuning the pre-trained model on my custom dataset (which doesn't look like a depth map at all :( ).

RJH97 commented 9 months ago

I got the same weird results. Have you solved this?

lhx0416 commented 9 months ago

> I got the same weird results. Have you solved this?

I haven't resolved this issue yet. Initially I thought it was due to an insufficient number of training samples, but later experiments showed it is unrelated to the amount of training data. I suspect it might be due to the high temporal continuity of the data captured by the phone (60 frames per second), or it could be an issue with the camera itself. The captured image resolution is 1920*1080 (perhaps this resolution is too high?). I'm still actively working on this problem and will post any progress here. If you manage to solve it, please let me know too. Thank you.

RJH97 commented 9 months ago

Thank you. My custom dataset has about 250k samples, but the same problem still happens. Reducing the number of frames sampled per second may be a solution; I'll reach out if there is any progress.
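For reference, one way to lower the effective frame rate is to subsample the capture before building the training split, so that adjacent training frames have a larger camera baseline. A minimal sketch (the stride value here is illustrative, not something from this thread):

```python
def subsample(frame_paths, stride=6):
    # Keep every Nth frame: 60 fps with stride 6 gives an effective 10 fps.
    return frame_paths[::stride]

# Fake list of 60 frame filenames standing in for one second of video
frames = [f"frame_{i:05d}.jpg" for i in range(60)]
kept = subsample(frames)
print(len(kept))  # 10
```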

lhx0416 commented 8 months ago

> Thank you. My custom dataset has about 250k samples, but the same problem still happens. Reducing the number of frames sampled per second may be a solution; I'll reach out if there is any progress.

I ran another experiment and found that the depth map from the first training epoch, while inaccurate, at least resembles a depth map. In subsequent epochs, however, the results become strange, with the depth maps exhibiting object textures.

The changes I made to the code are in Monodepth2/datasets/kitti_dataset.py:

Original:

```python
self.K = np.array([[0.58, 0, 0.5, 0],
                   [0, 1.92, 0.5, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]], dtype=np.float32)
```

Modified:

```python
self.K = np.array([[0.607, 0.000, 0.502, 0],
                   [0.000, 0.809, 0.499, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]], dtype=np.float32)
```
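As a side note, monodepth2 stores the intrinsics normalized by image size (mono_dataset.py later multiplies row 0 by the feed width and row 1 by the feed height), so a pixel-space calibration must be divided through by the original image dimensions first. A minimal sketch, using placeholder calibration numbers rather than a real calibration:

```python
import numpy as np

# Hypothetical pixel-space intrinsics for a 1920x1080 camera --
# replace fx, fy, cx, cy with your own calibration values.
fx, fy, cx, cy = 1165.0, 1165.0, 964.0, 539.0
width, height = 1920, 1080

# Normalize: row 0 by image width, row 1 by image height,
# matching the convention monodepth2's datasets expect.
K = np.array([[fx / width, 0,           cx / width,  0],
              [0,          fy / height, cy / height, 0],
              [0,          0,           1,           0],
              [0,          0,           0,           1]], dtype=np.float32)
print(K[0, 0], K[1, 1])
```

Getting this normalization wrong is a common cause of degenerate depth output, since the pose and reprojection losses depend on consistent intrinsics at every scale.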

Original:

```python
self.full_res_shape = (1242, 375)
```

Modified:

```python
self.full_res_shape = (640, 480)
```

I haven't made any other changes. It's also possible that my training data is insufficient, so I plan to drive out tomorrow to collect more data.
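When debugging outputs like these, it can also help to confirm that the raw sigmoid output of the depth decoder is being converted to depth correctly. The conversion below mirrors the `disp_to_depth` function in monodepth2's layers.py, with the repo's default depth range assumed:

```python
import numpy as np

def disp_to_depth(disp, min_depth=0.1, max_depth=100.0):
    # The network's sigmoid output in [0, 1] is mapped to a disparity
    # range, then inverted to obtain depth (as in monodepth2 layers.py).
    min_disp = 1.0 / max_depth
    max_disp = 1.0 / min_depth
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1.0 / scaled_disp
    return scaled_disp, depth

# Sanity check on a fake sigmoid output: depth should fall
# monotonically from max_depth (disp=0) to min_depth (disp=1).
disp = np.linspace(0.0, 1.0, 5)
_, depth = disp_to_depth(disp)
print(depth)
```

If the converted values look reasonable but the visualized maps still show object textures, the problem is more likely in the training signal (intrinsics, baseline between frames) than in the output conversion.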