@remy-byte Thank you for your interest in my project!
The `MAX_DEPTH` value in the evaluation script specifies that we evaluate only against ground-truth depths closer than `MAX_DEPTH` meters, following Monodepth2. The `max_depth` option in `options.py` defines the maximum depth value used to scale the estimated depth (though the model actually predicts normalized inverse depth). This parameter is used during training, and I guess this is what you're referring to. We set it to 100 because our training framework focuses on road environments, where most objects are within 100 meters. 100 is also a standard choice for training on the KITTI dataset, which is another reason for selecting this value.

I hope this helps! Please let me know if you have any other questions.
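For concreteness, this is how a Monodepth2-style pipeline typically turns the network's normalized sigmoid output into depth bounded by `max_depth`. The sketch below follows Monodepth2's `disp_to_depth()`; the `min_depth=0.1` default is Monodepth2's, and whether this repo uses exactly the same values is an assumption:

```python
import torch

def disp_to_depth(disp, min_depth, max_depth):
    """Monodepth2-style conversion of sigmoid disparity in [0, 1] to depth.

    The sigmoid output is mapped linearly into [1/max_depth, 1/min_depth]
    and then inverted, so max_depth bounds the farthest representable depth.
    """
    min_disp = 1.0 / max_depth  # disparity of the farthest point
    max_disp = 1.0 / min_depth  # disparity of the nearest point
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1.0 / scaled_disp
    return scaled_disp, depth

# With max_depth=100, a sigmoid output of 0 maps to exactly 100 m:
disp = torch.zeros(1, 1, 192, 512)
_, depth = disp_to_depth(disp, min_depth=0.1, max_depth=100.0)
print(depth.max())  # tensor(100.)
```

Under this convention, changing `max_depth` rescales what the same network output means in meters, which is why the choice matters when mixing datasets.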
Great! Then I'll look into the code that is available for now, and once the LSP and the rest of the components are released I'll come back with a paper study and maybe other questions 😄. One more thing: since I'm first trying to reproduce the same training behaviour (on Cityscapes, for example) when swapping out the depth model, is there a limitation on the resolution you chose for training?
When training Monodepth2 on Cityscapes, we resized and cropped images to 512x192 (WxH) to align intrinsic parameters with the ones specified here. You can freely adjust these values, but please note that the car masks are provided only at the 512x192 resolution. If you choose a different image size, you'll need to resize and zero-pad the masks to fit the new resolution.
For reference, resizing and cropping are handled in `align_img_size()` in `datasets/cityscapes.py`, which might be helpful for understanding the processing steps.
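To illustrate the mask adjustment described above, here is a minimal sketch of resizing a 512x192 car mask to a new training resolution and zero-padding the remainder. The function name and the padding placement (top-left) are illustrative assumptions, not taken from the repo; the actual logic lives in `align_img_size()`:

```python
import numpy as np
from PIL import Image

def adapt_car_mask(mask_path: str, new_w: int, new_h: int,
                   base_w: int = 512, base_h: int = 192) -> np.ndarray:
    """Sketch: scale a 512x192 binary car mask to new_w x new_h,
    zero-padding whatever the aspect-ratio change leaves uncovered."""
    mask = Image.open(mask_path).convert("L")
    # Scale by the smaller ratio so the resized mask fits inside the target.
    scale = min(new_w / base_w, new_h / base_h)
    resized = mask.resize(
        (round(base_w * scale), round(base_h * scale)),
        resample=Image.NEAREST,  # nearest-neighbour keeps the mask binary
    )
    padded = np.zeros((new_h, new_w), dtype=np.uint8)
    padded[:resized.height, :resized.width] = np.array(resized)
    return padded
```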
Hello!
First of all, I'm sending my highest regards to the whole team behind this project. It's a very unique idea that brings metric scale into relative depth models. I do have some questions regarding some parts of it:

- Since code is provided only for training on the Cityscapes/KITTI datasets, along with the estimated car heights, I'm guessing that the LSP modules and some other parts are yet to be released?
- I'm very eager to try it out with DepthAnythingV2 as a way to fine-tune its weights into producing metric depths. Is it possible to integrate it into the main pipeline's architecture like MonoDepth?
- When previewing the evaluation scripts, I noticed a `MAX_DEPTH` value. From what I've tested with other models, such as Metric3D, that value acts as a global scale on the predicted depth, so setting a different value makes everything appear farther from or closer to you. I'm guessing this is also the case for the `MAX_DEPTH` here, since it is an outdoor environment. Is this a problem when training on mixed datasets? I'm wondering how this value is chosen and how it impacts the results (see the sketch after this list for the evaluation-side convention).
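For context on that last question: in the Monodepth2-style evaluation the maintainer describes above, `MAX_DEPTH` does not rescale predictions; it filters which ground-truth pixels count toward the metrics, with predictions clipped to the same range. A minimal sketch of that convention, using Monodepth2's KITTI defaults as an assumption rather than values taken from this repo:

```python
import numpy as np

MIN_DEPTH = 1e-3  # Monodepth2's KITTI defaults; this repo's values are assumed
MAX_DEPTH = 80.0

def abs_rel_error(pred_depth: np.ndarray, gt_depth: np.ndarray) -> float:
    """Compute abs-rel only on ground-truth pixels inside (MIN_DEPTH, MAX_DEPTH),
    clipping predictions to the same range, as in Monodepth2's evaluation."""
    mask = np.logical_and(gt_depth > MIN_DEPTH, gt_depth < MAX_DEPTH)
    pred = np.clip(pred_depth[mask], MIN_DEPTH, MAX_DEPTH)
    gt = gt_depth[mask]
    return float(np.mean(np.abs(gt - pred) / gt))
```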
Once again, thank you for making the code/weights public. I'm very eager to experiment with the architecture and see its full capabilities. 😊