duanyiqun / DiffusionDepth

PyTorch Implementation of introducing diffusion approach to 3D depth perception
https://arxiv.org/abs/2303.05021
Apache License 2.0
293 stars · 16 forks

Question regarding implementation and memory usage in the DiffusionDepth model #45

Closed wangjiyuan9 closed 8 months ago

wangjiyuan9 commented 10 months ago

Dear author:

I have been following your diffusion-depth for some time. This model is very powerful, and I really appreciate your work.

My observation comes from the difference between the resnet version and the swin version of the code. In lines 373-378 of https://github.com/duanyiqun/DiffusionDepth/blob/208f7a5b9c29432d701e77666dbd8255d784323c/src/model/head/ddim_depth_estimate_res_swin_addHAHI.py, the upsampling adds about 400 MB of GPU memory usage per iteration. However, if we directly add the tensors, as implemented in the resnet version, this problem doesn't occur. I noticed in your comments that you had also tried skipping the upsampling. So my questions are:

  1. Did the results differ significantly when you directly added the tensors? Is upsampling necessary?
  2. Is there a way to reduce memory usage when using upsampling? If possible, it could save up to 7 GB of memory, greatly speeding up training. I really hope you can provide some guidance; I look forward to your reply.
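For context, the two fusion strategies I am comparing can be sketched roughly like this. The tensor names and shapes below are made up for illustration and are not the repo's actual variables; the point is only that the interpolated tensor in variant A is a new full-resolution allocation, which is where the extra per-iteration memory would come from:

```python
import torch
import torch.nn.functional as F

# Hypothetical feature maps standing in for two pyramid levels
# (shapes chosen arbitrarily for illustration).
fine = torch.randn(1, 64, 56, 56)     # higher-resolution feature
coarse = torch.randn(1, 64, 28, 28)   # lower-resolution feature

# Variant A (swin version, as I understand it): upsample before adding.
# F.interpolate materializes a new (1, 64, 56, 56) tensor, so each
# fusion step allocates an extra full-resolution buffer.
fused_upsample = fine + F.interpolate(
    coarse, size=fine.shape[-2:], mode="bilinear", align_corners=False
)

# Variant B (resnet version, as I understand it): add tensors directly,
# which only works when the two features already share a shape.
coarse_matched = torch.randn(1, 64, 56, 56)
fused_direct = fine + coarse_matched

print(fused_upsample.shape, fused_direct.shape)
```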
wangjiyuan9 commented 10 months ago

Also, regarding https://github.com/duanyiqun/DiffusionDepth/issues/42: I know that the Eigen train split is a traditional split, but how do I obtain the gt_depth that training requires? Did you do it as in monodepth2?

FliegenderVogel commented 9 months ago

Same question!