princeton-vl / RAFT-Stereo


Test with imgs from my own camera #65

Closed · steven-mantis closed this 1 year ago

steven-mantis commented 1 year ago

Hi, I succeeded in running the program in Colab with the Middlebury dataset and got excellent results, as shown in the paper. As a next step, I calibrated two RGB cameras with resolution 2448x2048 using OpenCV. With the images from my cameras, the disparity map looked not bad, with good boundaries: [image] However, after converting the corresponding saved .npy file to depth, and then to a point cloud with a pinhole camera model, the result for the fan looked like this: [image] [image] As can be seen, most depth values are wrong. I am not sure whether I am using your great work in a wrong way. Could you help me check my own data? Thanks in advance, and I really appreciate your work. https://drive.google.com/file/d/1iMOVrYo2kkwIYLKlOF54cqPi7p_JGM4L/view?usp=share_link
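For reference, a minimal sketch of the disparity-to-point-cloud conversion described above, assuming a rectified pinhole model. The focal lengths, principal point, baseline, and the `disparity.npy` filename are placeholders, not the actual calibration values, and the sign handling of the saved disparity is an assumption:

```python
import numpy as np

# Placeholder intrinsics/extrinsics -- substitute your own calibration values.
fx, fy = 2000.0, 2000.0   # focal lengths in pixels (hypothetical)
cx, cy = 1224.0, 1024.0   # principal point for a 2448x2048 sensor (hypothetical)
baseline_m = 0.10         # stereo baseline in meters (hypothetical)

disparity = np.load("disparity.npy")  # disparity in pixels (hypothetical filename)
disparity = np.abs(disparity)         # saved disparity may be negative depending on
                                      # the sign convention (assumption)

valid = disparity > 1.0               # skip near-zero disparities (depth blows up)
depth = np.zeros_like(disparity)
depth[valid] = fx * baseline_m / disparity[valid]   # Z = f * B / d

# Back-project every valid pixel to 3D with the pinhole model.
v, u = np.nonzero(valid)              # row (v) and column (u) indices of valid pixels
z = depth[valid]
x = (u - cx) * z / fx
y = (v - cy) * z / fy
points = np.stack([x, y, z], axis=-1)  # (N, 3) point cloud in camera coordinates
```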

Mythili-kannan commented 1 year ago

I am also facing a similar issue; please let me know if you find out anything about this. Thanks in advance.

lahavlipson commented 1 year ago

@steven-mantis It looks like these images are not even close to rectified. For example, the top of the fan is 720px from the top of the right image but 750px from the top of the left one. This would probably explain your results.

I suggest calibrating your cameras immediately before or after using them.
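A quick way to sanity-check rectification is to match features between the two views and measure their vertical offsets; in a well-rectified pair these should be close to zero. A rough sketch, assuming the images are saved as `left.png` and `right.png` (hypothetical filenames):

```python
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Match ORB features between the two views.
orb = cv2.ORB_create(2000)
kp_l, des_l = orb.detectAndCompute(left, None)
kp_r, des_r = orb.detectAndCompute(right, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_l, des_r)

# In a rectified pair, matched keypoints should lie on (almost) the same row.
dy = [abs(kp_l[m.queryIdx].pt[1] - kp_r[m.trainIdx].pt[1]) for m in matches]
print(f"median vertical offset: {np.median(dy):.1f}px")  # should be well under ~2px
```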

Mythili-kannan commented 1 year ago

@lahavlipson, thanks for the reply. I have properly rectified images, but I still get point clouds like the ones shown above. Can you please suggest how I can improve them? It would be very helpful.

gpuartifact commented 1 year ago

Based on what @lahavlipson observed, it seems that the extrinsic calibration of your stereo camera pair was not applied to the images. Typically, a Y offset of a feature between the two images of a stereo pair is a good indicator that the extrinsic calibration did not work or was not applied (e.g.: link to pdf).
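If that is the case, the usual OpenCV remedy is to compute rectification transforms from the calibration and remap both images before running the network. A sketch, assuming `K1, D1, K2, D2` (per-camera intrinsics/distortion) and `R, T` (extrinsics) come from a prior `cv2.stereoCalibrate` run and were stored in a hypothetical `stereo_calib.npz`:

```python
import cv2
import numpy as np

# Load calibration results (hypothetical filename and key names).
calib = np.load("stereo_calib.npz")
K1, D1, K2, D2 = calib["K1"], calib["D1"], calib["K2"], calib["D2"]
R, T = calib["R"], calib["T"]

image_size = (2448, 2048)  # (width, height), matching the original post

# Compute rectification transforms so epipolar lines become horizontal.
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T, alpha=0)

map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)

# Remap the raw captures; feed the rectified pair to the stereo network.
left_rect = cv2.remap(cv2.imread("left.png"), map1x, map1y, cv2.INTER_LINEAR)
right_rect = cv2.remap(cv2.imread("right.png"), map2x, map2y, cv2.INTER_LINEAR)
```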

steven-mantis commented 1 year ago

Hi. Following @lahavlipson's suggestion, I recalibrated the two cameras and rectified the inputs, but the results were still not good. Then I tested the pretrained model on synthetic data from Blender and got very good results. It seems that this model needs very high calibration quality in order to infer the disparity correctly.

Here is the test result: [image]

Some textureless planar areas might still be problematic, but for me the result is reasonable.