Closed liumingcun closed 3 years ago
Hi, the pretrained network was trained specifically on resized KITTI images. As such, it will only work with 416x128 images that share the same intrinsics as KITTI.
What you can do, though, is retrain the network with videos from your own camera; the retrained network will then work for your images.
Bear in mind that the network is very specialized, so if during training it only sees road pictures (as is the case in KITTI), the quality will be poor on different scenes.
Hello author, thank you very much for your answer. However, I did use KITTI data when I tested, but the results are exactly the same as the ones I showed.
Thank you, author. I figured it out: I was feeding DispNet normalized pixel values.
Hi, @liumingcun: I ran into the same problem as you. Should we feed DispNet pixel values in [0, 255]? I haven't read the code carefully, so I wonder whether the code in run_inference.py needs to be modified.
- Is the network the pretrained one you can download in the README, or did you train it yourself? Some hyperparameters (especially a too-strong smooth loss) will give you this degenerate network that always outputs the same thing.
- Are the images resized to 416x128?
- Are you feeding DispNet normalized pixel values, i.e. colors not in [0, 255] but in [-1, 1]?
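For reference, the normalization in question maps [0, 255] colors to [-1, 1]. A minimal sketch of that mapping (the actual tensor pipeline in run_inference.py may differ):

```python
import numpy as np

# Map pixel values from [0, 255] to [-1, 1], the range DispNet expects.
img = np.array([0.0, 127.5, 255.0], dtype=np.float32)
normalized = (img / 255 - 0.5) / 0.5
print(normalized)  # [-1.  0.  1.]
```

If the input was already in [0, 1], applying the same formula would squeeze everything into [-1, -0.99...], which is exactly the kind of degenerate input that produces a constant output.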
Hi, @ClementPinard: I downloaded the pretrained model from the link in the README, and when I test on KITTI images I just use run_inference.py (without any modification) to get the final depth and disparity images, which are exactly the same as below.
As you mentioned before:
Are you feeding DispNet normalized pixel values, i.e. colors not in [0, 255] but in [-1, 1]
But I didn't change any code, just loaded some images. How can I fix it? Thanks a lot~
Your picture lacks a bit of structure. Can you try a more urban scene? I am surprised by the results as they look very blurry, but at the same time you are inside a vegetation tunnel for the lower part of the screen :joy:
What do you mean by "I didn't change any code, just loaded some images"? What arguments did you use with the inference script?
I tried some more urban scenes, but the results still look the same.
I saved some urban scene images in ./imgs and the pretrained model in ./weights, then ran run_inference.py separately as below:
python3 run_inference.py --pretrained ./weights/dispnet_model_best.pth.tar --dataset-dir ./imgs --output-dir ./imgs_out/disp --output-disp
python3 run_inference.py --pretrained ./weights/dispnet_model_best.pth.tar --dataset-dir ./imgs --output-dir ./imgs_out/depth --output-depth
I just find it strange that it generates invariant depth and disparity. 😔 Have I made any mistakes? 😟
Hey, sorry about all this, there is a bug in the inference script. Indeed, I changed the image loading function to use imageio.imread and didn't test thoroughly enough. As a consequence, I didn't see that the output of imageio.imread was in [0, 1] and not [0, 255].
I changed line 66 from
tensor_img = ((tensor_img/255 - 0.5)/0.5).to(device)
to
tensor_img = ((tensor_img - 0.5)/0.5).to(device)
And it works much better. A fix is coming, but you can already apply the change I mentioned directly.
Clément
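A more defensive variant of that fix would work whichever range the loader returns. This is only a sketch (the helper name is hypothetical, not part of run_inference.py), assuming the heuristic that any value above 1.0 means a [0, 255] image:

```python
import numpy as np

def normalize_for_dispnet(img):
    """Map an image to [-1, 1] whether it arrives in [0, 255] or [0, 1].

    Hypothetical helper, not the actual code in run_inference.py.
    """
    img = np.asarray(img, dtype=np.float32)
    if img.max() > 1.0:  # heuristic: values above 1 imply a [0, 255] range
        img = img / 255.0
    return (img - 0.5) / 0.5
```

The trade-off is that a genuinely dark [0, 255] image whose pixels all fall below 1 would be misclassified, so pinning down the loader's actual output range is still the cleaner fix.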
That works fine! Thanks a lot!
Hello, my WeChat is 879997125, I think we can exchange related questions
Yeah, that's a good idea~ Check your WeChat :smile:
Hi, @ClementPinard :
The output of imageio.imread was actually still in [0, 255]. The bug is in skimage.transform.resize: if the dtype of the input image is uint8, the resize function automatically normalizes the values to [0, 1], while for float32 input it just resizes.
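This dtype-dependent behavior is easy to verify. A small sketch, assuming scikit-image's default `preserve_range=False`:

```python
import numpy as np
from skimage.transform import resize

img_u8 = np.full((8, 8), 255, dtype=np.uint8)
img_f32 = np.full((8, 8), 255.0, dtype=np.float32)

# uint8 input is converted to float and rescaled to [0, 1]
print(resize(img_u8, (4, 4)).max())   # 1.0
# float input keeps its original range
print(resize(img_f32, (4, 4)).max())  # 255.0
```

Passing `preserve_range=True` to resize keeps the [0, 255] range for uint8 input as well, which is another way to make the pipeline consistent.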
Hello author, I have a problem. No matter what picture I use, the final depth and disparity images are exactly the same. I don't know the reason. Could you please help me solve it? Thank you very much.