Hi,
The pre-processed Cityscapes depth is inverse depth, which makes it easier to represent infinite depth, such as the sky.
In the original paper, we use this inverse depth directly as the ground-truth depth and did not apply any additional processing.
Hope this helps.
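For intuition, a tiny sketch of why inverse depth handles the sky nicely (illustrative only, not code from the repo):

```python
import numpy as np

# Toy depths in metres; np.inf stands in for sky pixels.
depth = np.array([5.0, 50.0, np.inf])

# Inverse depth maps infinite depth to exactly 0 (1/inf == 0 in IEEE
# arithmetic), so the sky stays representable with a finite value.
inv_depth = 1.0 / depth
print(inv_depth)  # [0.2  0.02 0.  ]
```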
Hi,
Could you elaborate on how you obtain the ground-truth inverse depth values? Do you divide all the inverse depth values by some maximum value? If so, what is that value?
The inverse depth is actually the original Cityscapes disparity data. As stated in the previous comment, we did not apply any further processing.
Thank you for your reply.
The original Cityscapes disparity data has values ranging from 0 to 32257, while the inverse depth values provided in your Dropbox range from 0 to 0.4922.
This means some processing must have been done to bring the original disparity values into the range 0 to 1, right? Could you provide these details?
That is probably because you were using Image.open to decode the image file, which returns a different data type. Try plt.imread instead; then you should get identical values.
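For reference, a minimal sketch of the difference between the two loaders (the file path below is just a placeholder):

```python
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

# Placeholder path to one of the official 16-bit disparity PNGs.
path = "disparity/train/aachen/aachen_000000_000019_disparity.png"

raw = np.array(Image.open(path))   # PIL keeps the raw 16-bit integer values
print(raw.dtype, raw.max())        # e.g. int32, ~32257

norm = plt.imread(path)            # matplotlib returns PNGs as floats in [0, 1]
print(norm.dtype, norm.max())      # float32, ~0.4922 (= 32257 / 65535)
```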
Thank you. With plt.imread I am able to get the same values.
"binary_mask = (torch.sum(x_output, dim=1) != 0).unsqueeze(1).to(device)"
Regarding this binary mask in the depth error, could you let me know whether the goal is to ignore the zero-valued pixels in every ground-truth depth map?
Thank you.
Since those ground-truth depths were recorded with a real measuring sensor, they are typically not perfect. If you visualise the depth maps, you can easily observe that the 0-values are invalid, meaning no valid depth was recorded at those pixels.
Hope that helps.
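For illustration, here is a minimal sketch of how such a validity mask is typically applied in a loss. It mirrors the line quoted above but assumes a simple L1 loss; the repo's exact loss may differ:

```python
import torch

def masked_depth_loss(pred, gt):
    # pred, gt: (B, 1, H, W) depth maps; gt == 0 marks pixels where the
    # sensor recorded no valid measurement.
    binary_mask = (torch.sum(gt, dim=1) != 0).unsqueeze(1)  # (B, 1, H, W) bool
    diff = torch.abs(pred - gt) * binary_mask               # zero out invalid pixels
    return diff.sum() / binary_mask.sum()                   # mean over valid pixels only
```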
Thank you, that was very helpful.
> Thank you for your reply.
> The original Cityscapes disparity data has values ranging from 0 to 32257, while the inverse depth values provided in your Dropbox range from 0 to 0.4922.
> This means some processing must have been done to bring the original disparity values into the range 0 to 1, right? Could you provide these details?
I have the same confusion. Have you solved the problem?
> That is probably because you were using Image.open to decode the image file, which returns a different data type. Try plt.imread instead; then you should get identical values.
As @lorenmt suggested, after using plt.imread I got the same range as in the data provided in the Dropbox.
Thank you! When I use plt.imread, it's the same as the provided data.
But I have another question. The official cityscapesScripts page https://github.com/mcordts/cityscapesScripts says:
"disparity precomputed disparity depth maps. To obtain the disparity values, compute for each pixel p with p > 0: d = ( float(p) - 1. ) / 256., while a value p = 0 is an invalid measurement. Warning: the images are stored as 16-bit pngs, which is non-standard and not supported by all libraries."
The data's max value is 0.4922, so applying that formula gives (0.4922 - 1) / 256 < 0, which would mean all the disparity values are negative. Should the disparity values really be < 0? This confuses me a lot. >.<
> Thank you for your reply.
> The original Cityscapes disparity data has values ranging from 0 to 32257, while the inverse depth values provided in your Dropbox range from 0 to 0.4922.
> This means some processing must have been done to bring the original disparity values into the range 0 to 1, right? Could you provide these details?
I think I know. With plt.imread, if the image is a .png, the function returns float values in [0, 1], computed as the raw value divided by the maximum for the bit depth (65535 for 16-bit).
The plt.imread docs say: "PNG images are returned as float arrays (0-1). All other formats are returned as int arrays, with a bit depth determined by the file's contents."
So 0 to 32257, divided by 65535, gives 0 to 0.4922.
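Putting that together, a small sketch (placeholder path) that recovers the raw 16-bit values and applies the official cityscapesScripts formula:

```python
import numpy as np
import matplotlib.pyplot as plt

path = "disparity/train/aachen/aachen_000000_000019_disparity.png"  # placeholder

norm = plt.imread(path)        # floats in [0, 1], i.e. raw / 65535
p = np.round(norm * 65535.0)   # recover the raw 16-bit values

disparity = np.zeros_like(p, dtype=np.float32)
valid = p > 0                                  # p == 0 is an invalid measurement
disparity[valid] = (p[valid] - 1.0) / 256.0    # d = (float(p) - 1.) / 256.
```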
I would highly suggest checking out this post on computing the real depth: https://github.com/mcordts/cityscapesScripts/issues/55#issuecomment-411486510.
Using plt.imread here is more of an approximation; you need the focal length (and baseline) to fully convert the disparity into real depth.
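For completeness, the conversion in that post looks roughly like this. The baseline and focal length below are the approximate values quoted there; the exact numbers live in Cityscapes' per-image camera calibration files:

```python
import numpy as np

baseline = 0.209313   # metres, Cityscapes stereo baseline (approximate)
focal = 2262.52       # pixels, typical fx from the camera calibration files

disparity = np.array([[0.0, 15.5], [31.0, 62.0]])  # toy disparity values (pixels)
depth = np.zeros_like(disparity)
valid = disparity > 0                               # 0 disparity = invalid pixel
depth[valid] = baseline * focal / disparity[valid]  # depth = B * f / d
# e.g. a disparity of 31 px gives a depth of about 15.3 m
```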
Hi @lorenmt, I downloaded your processed Cityscapes dataset and found that the values in those numpy arrays are >= 0 (most of them < 0.5). When I load the official Cityscapes disparity, the values are also >= 0 but much larger (~30000). Would you tell me how you pre-processed the original disparity data to get those numpy arrays? Thanks in advance.