SuperAKK opened this issue 2 years ago
Hello and thanks for reaching out.
Indeed you cannot directly stereo rectify the provided depth maps.
I am in the process of cleaning my data manipulation code and I am going to release it in the coming month. Until then, you can follow the simple process described below to generate the disparity samples you want:
I am assuming that you are using OpenCV to stereo rectify your images (I1, I2) based on the datasets' provided calibration parameters. During this process, you used stereoRectify(), which gave you two projection matrices (P1, P2) and two rectification transform matrices (R1, R2), plus a Q matrix which expresses the mapping between depth and disparity.
You can generate the disparity maps as follows: for each sample, project every ground-truth 3D point into both rectified views and take the difference of the horizontal pixel coordinates; a sketch of this follows below.
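Here is a minimal sketch of that process, assuming each ground-truth sample is loaded from the provided .tiff file as an (H, W, 3) pointmap expressed in the original left camera frame, with NaN marking invalid points; the function name and array layout are my assumptions, not the dataset tooling's:

```python
import numpy as np

def disparity_from_gt(points3d, R1, P1, P2):
    h, w, _ = points3d.shape
    pts = points3d.reshape(-1, 3)
    pts = pts[~np.isnan(pts).any(axis=1)]

    # Rotate the points from the original left frame into the rectified
    # left frame of reference.
    pts_rect = (R1 @ pts.T).T

    # Project into both rectified views with the 3x4 projection matrices
    # returned by cv2.stereoRectify().
    def project(P, p):
        uvw = (P @ np.hstack([p, np.ones((len(p), 1))]).T).T
        return uvw[:, :2] / uvw[:, 2:3]

    pl = project(P1, pts_rect)  # left rectified pixel coordinates
    pr = project(P2, pts_rect)  # right rectified pixel coordinates

    # After rectification the rows align, so the disparity of each point is
    # the horizontal difference between its two projections.
    disp = pl[:, 0] - pr[:, 0]

    # Scatter each disparity value at the rounded left-view coordinate.
    disparity = np.full((h, w), np.nan, dtype=np.float32)
    u = np.round(pl[:, 0]).astype(int)
    v = np.round(pl[:, 1]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    disparity[v[inside], u[inside]] = disp[inside]
    return disparity
```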
If something was not clear please let me know and I will explain it better. Otherwise, I will let you know when I upload the data manipulation code.
Many thanks for your timely reply!
I will try your process and report back on my progress.
That's so kind of you! Thanks again! 😀
Hello dimitris!
Thank you for the detailed process of data pre-processing! I have successfully generated the disparity samples. ✌
And I have two more details to figure out:
I found that there is a small difference between the two ways (assigning to floor(pl.x) vs. ceil(pl.x)), but the difference is within 0.5 pixels. Does this difference matter?
However, the generated depth map does not seem to be correct. Is there something wrong with my backprojection process?
Looking forward to your reply! Thanks again!❤
I also noticed that there are many outliers in the generated disparity samples, which seem unreasonable.
For example, where the left image is black, the corresponding disparity map still has values. And these values are within the disparity range, which means they can't simply be masked out!
Did you have the same situation? If so, what kind of post-processing did you do?
Thank you!
Hi again,
Great to hear that you were able to produce the disparity maps.
1) It does not really matter which pixel (floor(pl.x) or ceil(pl.x)) you assign the disparity value to, because either way you introduce a comparable amount of error.
2) The process you described, to go from the rectified disparity to the original depth, is mostly correct, except for step 3. Instead of using P1 to project your point cloud back to the original left frame of reference, you need to call OpenCV's projectPoints() with the original left camera matrix and distortion coefficients from the SCARED calibration files as inputs. The function will give you the projection location of each 3D point and will account for any distortions (see the sketch below).
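For example, a minimal sketch of the corrected step, assuming pts_rect holds the reconstructed (N, 3) point cloud in the left rectified frame and K1/D1 are the original left intrinsics and distortion coefficients (variable names are mine):

```python
import cv2
import numpy as np

# Rotate from the rectified frame back to the original left frame inside
# projectPoints() by passing the inverse rectification rotation as rvec.
rvec, _ = cv2.Rodrigues(R1.T)
img_pts, _ = cv2.projectPoints(pts_rect, rvec, np.zeros(3), K1, D1)
img_pts = img_pts.reshape(-1, 2)          # pixel locations in the original left image
depth_orig = (R1.T @ pts_rect.T).T[:, 2]  # depths to scatter at those pixels
```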
As for your second message, I am not sure what you mean by saying that the image is black. Do you mean that you get ground-truth disparity values in areas where the image depicts black tissue, or that you get disparity values in areas that are padded black during the stereo rectification process? The first case is completely normal; however, the second should not happen, and there is probably an error in your disparity generation code.
I suggest first making sure that your disparity generation code works and then trying to implement the disparity to depth program.
I will try to upload a small gist over the weekend showing how I generated the disparity maps. Until then, feel free to ask for any clarifications here.
Hi again,
I uploaded a sample script containing code to generate disparity images given a .tiff ground truth file. For simplicity, I didn't include calibration loading and rectification code but it should be fairly easy to copy those in.
I will let you know when I release the full conversion repository in case you are still interested. Let me know if you need any other help.
Hello dimitris!
Really appreciate your patience! It's been a great help to me!
Thanks to your detailed process, I have generated the disparity map with no errors. And I'll also run your code to make sure I'm doing it right.
There is one more question about post-processing. I didn't quite understand what you said in the post-processing part of the repository: 'Because of the rectification alpha used for the test frames, this process results in pointmaps with a grid of unknown values. The last step is to interpolate the missing values using cubic interpolation.' I want to figure out how to do this last step.
Thanks again! 😀
Hi,
No worries at all, I am happy to help.
Using the network's disparity prediction, you can compute a point cloud in the left rectified frame of reference. To evaluate against the provided test sequence, you need depth map information in the original left frame of reference. If you project the estimated point cloud back to the original frame of reference (after first rotating it back, etc.), the reconstructed points may not be dense enough to project to every pixel of the left frame. Because the SCARED evaluation imposes a penalty on unknown depth values, after projecting the reconstructed point cloud to the original frame of reference you need to interpolate the missing pixel values based on adjacent depth information.
Given a semi-dense depth map, you can use the following function to populate the missing depth values.
```python
import numpy as np
from scipy import interpolate

def interpolate2d(array):
    """Fill the invalid (NaN) values of a 2D array by cubic interpolation."""
    x = np.arange(0, array.shape[1])
    y = np.arange(0, array.shape[0])
    # Mask invalid values.
    array = np.ma.masked_invalid(array)
    xx, yy = np.meshgrid(x, y)
    # Keep only the valid values and their pixel coordinates.
    x1 = xx[~array.mask]
    y1 = yy[~array.mask]
    newarr = array[~array.mask]
    # Interpolate the missing values from the valid neighbours.
    out = interpolate.griddata((x1, y1), newarr.ravel(), (xx, yy), method='cubic')
    return out
```
If you run the above code you will find it is very slow. There are certainly other methods that achieve the same thing a lot faster, but this is what I did for the challenge.
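For reference, a made-up usage example (the array contents are illustrative only):

```python
import numpy as np

# `depth` is a float32 depth map with NaN wherever the projected point
# cloud left no value.
depth = np.full((64, 80), np.nan, dtype=np.float32)
depth[::2, ::2] = 50.0        # pretend every other pixel received a projected value
dense = interpolate2d(depth)  # cubic interpolation fills the interior holes
# Pixels outside the convex hull of the valid samples remain NaN.
```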
Hello dimitris!
Really appreciate your sample script. I found that the generated disparity maps are about the same; it's just that the positions of the pixels sometimes differ by one pixel. This may be caused by the rounded projection coordinates.
I also noticed that you set alpha=0 in stereoRectify() on the train set, while setting alpha=1 on the test set. And when I project the test-set disparity back to depth using the corresponding Q, the generated depth map has a lot of fine lines, as shown in the following figure.
Is this normal, or am I doing something wrong?
Thank you! ❤
Hi again,
For the training data, I used alpha=0 because I wanted the resulting stereo-rectified frames to cover the whole image and not include any black borders in the periphery. The reasoning behind this was that I didn't want to train the network with images containing black patches. However, by using alpha=0, some ground-truth points project outside the rectified image frames.
Again, because the SCARED evaluation protocol imposes a penalty for unknown pixel values, I stereo rectified the test set with alpha=1 to ensure that the whole stereo frames are visible in the rectified views. Running inference on those frames provides depth values for the whole original image area. Converting those back to depth expressed in the original frame of reference results in depth maps with missing information in a grid pattern like the one you shared (the missing values are the black parts of your image, and they should span the whole image, not just the middle). To populate the grid with depth values, I used the interpolate2d function I shared in my previous message.
To answer your question: I am not sure what the image you shared is showing. If the dark pixels indicate missing values, then this is normal; however, I would expect the grid to span the whole image and not be limited to the middle. Furthermore, I am not sure about the scale you are using. Assuming that color intensity indicates depth, your depth map is too bright; do you get reasonable depth values?
I can provide you with a conversion script to remove any guesswork but you would have to wait until the end of the week.
Hello dimitris! Really appreciate your patience!
Forgive me for not being clear. Actually, this image is a mask, and I just wanted to show where there are no values in the disparity image.
I also used your pre-trained model to predict on the test set, but the generated disparity maps didn't seem to be correct. As shown, the rectified left image and the prediction are provided.
Am I doing something wrong? My conda environment uses Python 3, and DeepPruner recommends Python 2. Could this be the reason?
Thank you! ❤
Hi again,
I just downloaded the repo and weights and tested with a copy of the test keyframes rectified with alpha=0. On my end, the disparity for this particular sample gets predicted correctly (see below).
To run inference, I am using the following command:

```
python submission_scared.py --datapath {path to the folder containing the left_rect and right_rect directories} --loadmodel ./deeppruner_finetune_scared_epoch_290.tar --save_dir ./out_kfs --logging_filename test.log
```
Python 3 should be fine, as I am also running on 3. The only problem you may experience is in this and this file, where comparisons are made using "is" instead of "==". I just found out about that and will push updates; a toy example of the pattern is shown below.
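To illustrate the kind of comparison that needs fixing (a made-up line, not the actual code from those files):

```python
mode = "evaluation"
if mode is "evaluation":   # buggy: `is` tests object identity, so this may be False
    pass
if mode == "evaluation":   # fixed: `==` compares the string values
    pass
```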
Try updating the files mentioned above. If you are still having issues while using the same script and weights, then there is something wrong with the input data samples. Check the following:
- the left rectified image is created from the original left image, and the right rectified image from the original right image.
As for the thin black lines when you project back to the original frame, those are to be expected. Because you essentially start from a smaller area in the rectified frame of reference, your projected points are not enough to span the whole original image, and because you round the projection pixel coordinates during the projection, you end up with this grid-like pattern that has no information; a rough sketch of the whole conversion follows below. Again, this is why you have to interpolate the image afterwards. There are currently more efficient ways to achieve the same thing.
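Roughly, the conversion looks like this, assuming disparity, Q, and R1 come from your rectification step and K1/D1 from the SCARED calibration files (variable names are mine; this is a sketch, not the exact script I used):

```python
import cv2
import numpy as np

points = cv2.reprojectImageTo3D(disparity, Q)   # (H, W, 3) points, left rectified frame
valid = np.isfinite(disparity) & (disparity > 0)
pts = (R1.T @ points[valid].T).T                # rotate back to the original left frame

rvec, _ = cv2.Rodrigues(np.eye(3))              # points are already in the target frame
uv, _ = cv2.projectPoints(pts, rvec, np.zeros(3), K1, D1)
uv = np.round(uv.reshape(-1, 2)).astype(int)    # the rounding is what opens the holes

depth = np.full(disparity.shape, np.nan, dtype=np.float32)
inside = (uv[:, 0] >= 0) & (uv[:, 0] < depth.shape[1]) & \
         (uv[:, 1] >= 0) & (uv[:, 1] < depth.shape[0])
depth[uv[inside, 1], uv[inside, 0]] = pts[inside, 2]  # sparse, grid-patterned map
depth = interpolate2d(depth)                    # fill the missing values afterwards
```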
Over the weekend I will try to post a snippet showing how I converted disparities back to depth in the original frame of reference. Let me know if you managed to make the network work.
Hello again,
I've uploaded the script to convert the disparities to depth, expressed in the original frame of reference here.
Did you manage to make reasonable disparity predictions using the network? Another thing I forgot to mention in my previous message is that you need to rectify each dataset with the corresponding calibration parameters, each dataset has its own calibration.
Hello dimitris! Really appreciate you sharing the code.
I have successfully used your repo and weights to generate disparity maps similar to yours. As shown, the keyframes (dataset8_keyframe_0) and disparity are provided.
However, you mentioned that the test keyframes are rectified with alpha=0, while the disparity map seems to be predicted on keyframes rectified without setting alpha.
As shown below, the keyframes rectified with alpha=0 and alpha=1 are provided.
So one more thing I want to figure out: you mentioned that the images used in training are all rectified with alpha=0, while the test images are rectified with alpha=1. Wouldn't the difference in image content affect the network's performance? Do I need to rectify both the train and test images with the same alpha value?
Thanks again! 😀
Hello,
I am happy to see that you managed to get reasonable disparity results.
Indeed, in my previous message I used frames rectified without setting any rectification alpha; apologies if I confused you.
It should not matter if you change the alpha between your test and training sets, because stereo matching networks essentially learn to find pixel correspondences between the two images. By changing the alpha you only slightly scale the disparity range in your training set.
That being said, because I trained on only a few frames, there may be some accuracy difference if you change the alpha used to stereo rectify the training set. This is what I believe; I didn't experiment with training using different rectification alphas, therefore I cannot give you a definitive answer.
Hello dimitris! That's so kind of you!
As said before, I have successfully used your repo and weights to generate disparity maps. I then used the file (https://github.com/dimitrisPs/DeepPruner_SCARED/blob/master/disparity_to_original_depth.py) to convert the disparity back to depth with no problem.
As shown below, the left image of testset8 keyframe_5 and the converted depth image are provided.
I then measured the depth mean absolute error according to the report (masking areas with no ground truth and discarding frames for which less than 10% of the pixels have ground truth measurements). But the results seem to be inconsistent with the results of your method reported in the paper. I take keyframe_5 of test sets 8 and 9 as an example, since keyframe_5 is a single frame. In the paper, the MAE of d8k5 and d9k5 for your method is 0.62mm and 0.41mm, while the results I measured are 1.91mm and 0.75mm respectively. I also measured the other keyframes, and the results were still worse than reported.
Am I doing something wrong? Sorry to bother you again, and wish you a happy weekend! 😀
Hi again,
Most likely, yes, something is wrong in your evaluation or data conversion process. I cloned this repository to evaluate those two frames and got results that are very close to what was reported in the SCARED paper.
I really cannot know what the issue might be but I would check:
With the bug in the disparity_to_original_depth.py evaluation code fixed, I am getting a 0.42mm MAE for ds9_kf4 and a 0.65mm MAE for ds8_kf4.
Those values are slightly different from the ones reported in the paper and that is for two reasons:
I uploaded the inferred disparity and converted depth files from my code here. If you are able to reproduce them, then the problem is in your evaluation code. Remember to change the calibration parameters in the disparity_to_original_depth.py script, and save the disparity/depth images after you multiply their values by 128 and convert them to uint16 (see the sketch below).
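A quick sketch of that saving convention (the file name is made up):

```python
import cv2
import numpy as np

disp_u16 = (disparity * 128.0).astype(np.uint16)  # scale by 128 and quantize
cv2.imwrite('disparity_kf5.png', disp_u16)

# Reading it back for evaluation:
disparity = cv2.imread('disparity_kf5.png', cv2.IMREAD_UNCHANGED).astype(np.float32) / 128.0
```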
Hello dimitris! Really appreciate your detailed explanation. ❤
I measured the depth mean absolute error according to the report, and the test results are shown below:
| MAE (mm) | k1 | k2 | k3 | k4 | k5 |
| -- | -- | -- | -- | -- | -- |
| d8 | 7.80 | 2.11 | 1.96 | 2.58 | 0.64 |
| d9 | 4.75 | 1.21 | 3.65 | 1.71 | 0.42 |
Hello dimitris!
Sorry to bother you; this is a question about the DeepPruner_SCARED repository (https://github.com/dimitrisPs/DeepPruner_SCARED), and issues cannot be opened on that repository.
I noticed that you achieved great performance in that challenge and mentioned how to generate disparity samples in the DeepPruner_SCARED repository. Could you share the data manipulation code for the data processing? It would be a great help to me.
Thank you! ❤ You have done great work. Looking forward to your reply.