Closed. ramuneblue closed this issue 3 years ago
Sorry, I don't quite understand your first question. You can take the max over all values, but that won't be informative as MiDaS only provides relative depth.
You can find a Colab here: https://colab.research.google.com/github/pytorch/pytorch.github.io/blob/master/assets/hub/intelisl_midas_v2.ipynb
Thank you for your quick comment.
I see. Then could you tell me how, and from which function, I can get the max and min depth values for a single image?
I would like to find where the deepest point is in a single image, and also compare which image has the deepest point among several still images. Even if the "depth" is relative, it should still be possible to check this using both the max and min values.
I can access the Colab, thank you very much! It's really great.
The result in the Colab is a numpy array, so you can simply use numpy.max and numpy.min on it.
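For readers who also want the x/y position of the extremes, here is a minimal sketch. It assumes the prediction is a 2-D numpy array of shape (height, width), as in the Colab; the helper name depth_extremes and the small demo array are made up for illustration. Note that MiDaS predicts relative inverse depth, so the largest value should correspond to the nearest point and the smallest value to the farthest one. Because each image has its own unknown scale and shift, comparing raw values across different images is unlikely to be meaningful.

```python
import numpy as np

def depth_extremes(prediction):
    """Return the min and max of a 2-D prediction with their (x, y) pixel positions.

    Hypothetical helper, not part of MiDaS. x is the column index, y is the row index.
    """
    pred = np.asarray(prediction, dtype=np.float64)
    # argmin/argmax return flat indices; unravel_index maps them back to (row, col).
    min_row, min_col = np.unravel_index(np.argmin(pred), pred.shape)
    max_row, max_col = np.unravel_index(np.argmax(pred), pred.shape)
    return {
        "min": (float(pred[min_row, min_col]), (int(min_col), int(min_row))),
        "max": (float(pred[max_row, max_col]), (int(max_col), int(max_row))),
    }

# Stand-in for the array produced in the Colab (values are invented).
demo = np.array([[0.2, 0.9, 0.4],
                 [0.1, 0.5, 0.3]])
result = depth_extremes(demo)
print(result)
```

With a real prediction, pass the array from the Colab in place of demo.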
Thank you for the simple answer. Actually, I'm not a Python person, so I'll try to get them from there.
Thanks for this! How could we then convert the output to meters? There seem to be many questions about this in the 'issues', but I'm having trouble understanding how to do it. Could you show us how, either here or in the Colab?
Say I have figured out what my camera intrinsics are, like in #5,
{
    "height" : 480,
    "intrinsic_matrix" :
    [
        610.0023193359375, #fx
        0.0,
        0.0,
        0.0,
        609.85760498046875, #fy
        0.0,
        425.36004638671875, #cx
        237.9273681640625, #cy
        1.0
    ],
    "width" : 848
}
You'd need to know the absolute depth of at least two pixels in the image to derive the two unknowns (scale and shift). You could then align the predictions to these measurements, as done in our SSIMSE loss.
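As a rough illustration of the two-pixel case, the sketch below solves for the two unknowns directly. It assumes the alignment is done in inverse-depth space, i.e. 1/z ≈ scale * prediction + shift, which matches MiDaS predicting relative inverse depth; the function names and the numbers are hypothetical. With only two measurements the result will be sensitive to noise, so treat it as a starting point.

```python
import numpy as np

def scale_shift_from_two_pixels(p1, p2, z1, z2):
    """Solve 1/z = scale * p + shift from two pixels.

    p1, p2: raw predictions at the two pixels (assumed to be relative inverse depth).
    z1, z2: measured absolute depths at those pixels, in meters.
    """
    if p1 == p2:
        raise ValueError("the two pixels must have different predicted values")
    d1, d2 = 1.0 / z1, 1.0 / z2          # convert measured depth to inverse depth
    scale = (d1 - d2) / (p1 - p2)
    shift = d1 - scale * p1
    return scale, shift

def to_meters(prediction, scale, shift):
    """Apply the alignment and invert. Values near zero inverse depth blow up."""
    inverse_depth = scale * np.asarray(prediction, dtype=np.float64) + shift
    return 1.0 / inverse_depth

# Invented example: prediction 10 measured at 2 m, prediction 4 measured at 5 m.
scale, shift = scale_shift_from_two_pixels(10.0, 4.0, 2.0, 5.0)
depth_m = to_meters(np.array([10.0, 4.0, 5.0]), scale, shift)
print(scale, shift, depth_m)
```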
But how can we use the SSIMSE loss in your paper to estimate the shift and scale in an image given two pixel GT depths?
See the code here https://gist.github.com/ranftlr/45f4c7ddeb1bbb88d606bc600cab6c8d
compute_scale_and_shift computes the scale and shift that align the prediction to the target, as in the SSIMSE loss.
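When more than two measured pixels are available, the same fit can be posed as a least-squares problem. The sketch below is not the code from the gist; it is a plain numpy version of the idea, with a hypothetical function name, and it assumes that prediction and target are both expressed in inverse depth and that mask marks the pixels with valid measurements.

```python
import numpy as np

def fit_scale_and_shift(prediction, target, mask):
    """Least-squares fit of target ≈ scale * prediction + shift over masked pixels.

    Hypothetical numpy sketch; prediction, target and mask share the same shape.
    """
    valid = np.asarray(mask, dtype=bool)
    p = np.asarray(prediction, dtype=np.float64)[valid]
    t = np.asarray(target, dtype=np.float64)[valid]
    if p.size < 2:
        raise ValueError("need at least two valid pixels")
    # Each row of the design matrix is [p_i, 1], so the solution is [scale, shift].
    design = np.stack([p, np.ones_like(p)], axis=1)
    (fit_scale, fit_shift), *_ = np.linalg.lstsq(design, t, rcond=None)
    return float(fit_scale), float(fit_shift)

# Invented data: the target is an exact affine function of the prediction.
pred_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
target_demo = 0.5 * pred_demo + 0.1
mask_demo = np.array([[True, True], [True, False]])
fit_scale, fit_shift = fit_scale_and_shift(pred_demo, target_demo, mask_demo)
print(fit_scale, fit_shift)
```

The aligned inverse depth is then fit_scale * prediction + fit_shift, and taking its reciprocal gives depth in the units of the measurements.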
Thank you for your quick comments. Let me close this question.
Hi. Before running your code, I would like to understand how to get the maximum depth as a numeric value, along with its x/y coordinates. Could I have your advice on this? If you could tell me the specific function that produces the maximum depth, it would be really helpful.
Allow me to add one more question: can your code be used on Google Colaboratory?
I'm looking forward to hearing from you soon!