Open puyiwen opened 2 years ago
Both are needed. I chose decreasing channel numbers (e.g., 512, 256, 128, 64) based on experience, and then reduced each of them to a smaller number, again based on experience.
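To make the idea concrete, here is a minimal sketch of such a channel schedule. The halving rule, the `divisor`, and the `minimum` are assumptions for illustration only; the actual values were picked by experience and checked by experiment, not by a formula. The function names are hypothetical.

```python
def decoder_channels(first=512, stages=4):
    """Return a decreasing channel schedule, halving at each stage (assumed rule)."""
    return [first // (2 ** i) for i in range(stages)]


def shrink(channels, divisor=4, minimum=8):
    """Reduce every stage to a smaller width; divisor and minimum are hypothetical."""
    return [max(c // divisor, minimum) for c in channels]


base = decoder_channels()                   # starting point: 512, 256, 128, 64
slim = shrink(base)                         # lighter variant of the same schedule
stage_io = list(zip(slim[:-1], slim[1:]))   # (in_channels, out_channels) per upsampling stage

print(base, slim, stage_io)
```

Each candidate schedule would still need to be trained and evaluated to see how much accuracy is lost for the speed gained.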
Thank you for your answer. I have always been confused about the data augmentation part. Does rotating the depth map leave the ground truth unchanged? Do I need to adjust the depth values after cropping the image? For example, if the original image is 640×480, cropped to 600×400, and then resized to 640×480 (the network input size), this is equivalent to moving the camera closer, so should the depth values be scaled down?
During my experiments, I found that your LiteDepth architecture outperforms FastDepth on every metric on NYU_Depth_v2. However, when I ran a demo on a few real indoor photos of my own, LiteDepth's depth estimates were worse than FastDepth's. Why is this? Were my demo images just a coincidence?
After cropping, there is no need to resize the image back to the raw resolution. Resizing changes the pixel size, which degrades model performance.
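A minimal sketch of this kind of crop, assuming NumPy arrays and a network that accepts the cropped size directly. The function name `random_crop_pair` and its parameters are hypothetical, not taken from the repository. The same window is cut from the image and the depth map, and the depth values are left untouched because nothing is rescaled.

```python
import numpy as np


def random_crop_pair(image, depth, crop_h, crop_w, rng=None):
    """Cut the same random window from an image and its depth map, without resizing."""
    rng = rng or np.random.default_rng()
    h, w = depth.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop size exceeds input size")
    # Upper bound is exclusive, so the window always fits inside the input.
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    image_crop = image[top:top + crop_h, left:left + crop_w]
    depth_crop = depth[top:top + crop_h, left:left + crop_w]
    return image_crop, depth_crop


# Example with the sizes from the question: 640x480 input, 600x400 crop.
image = np.zeros((480, 640, 3), dtype=np.uint8)
depth = np.arange(480 * 640, dtype=np.float32).reshape(480, 640)
image_crop, depth_crop = random_crop_pair(image, depth, 400, 600)
print(image_crop.shape, depth_crop.shape)
```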
That is related to model generalization. LiteDepth may suffer from a generalization issue. This is an interesting topic, but there is little literature on it. I have also observed this issue in my DepthFormer and BinsFormer: the metrics are good, but the models are not practical for real applications. So I think this is actually an important problem.
Hi, great work! I have some questions about the model. How do you set the number of channels in the upsampling layers? Was it derived from experiments or from experience? Thank you very much!