IDLabMedia / open-dibr

MIT License
19 stars, 1 fork

How to use open-DIBR on custom data? #3

Open liustu opened 3 months ago

liustu commented 3 months ago

Hi, could you explain how to apply OpenDIBR to custom data? Thank you.

jlartois commented 3 months ago

Thank you for your question. Once you have gone through the information in the wiki, you can follow these steps on your own dataset:

  1. I assume you have a set of color images/videos.
  2. To estimate the intrinsics and extrinsics, you can use a program like COLMAP. For example, if you do the "automatic reconstruction" and then the "export as txt" steps, you get a cameras.txt and an images.txt file. These contain the intrinsics, as well as the position and quaternion rotation of each view. Quaternions can be converted to Euler angles (OpenDIBR expects degrees). Note that COLMAP uses its own axial system.
  3. You will need to estimate the depth of each view, and save these as images or videos, as explained in the wiki.
  4. Lastly, create the JSON file that serves as input for OpenDIBR. The main thing here is that OpenDIBR by default expects the position and rotation to be in the OMAF axial system. You could change this by commenting out these 2 lines in src/ioHelper.h:
    omafToOpenGLPosition(/*out*/pos);
    omafToOpenGLRotation(/*out*/rot);

    Now, the program assumes that everything uses the OpenGL axial system. The only difference with the COLMAP axial system is that the Y and Z axes are flipped, as described here:

    The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image.

    Good luck!
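
    As a rough illustration of step 2, the sketch below reads the per-image pose lines from a COLMAP images.txt export. It relies on the COLMAP text layout (IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME, followed by a line of 2D points). The function name parse_colmap_images, the returned dictionary layout and the sample values are made up for this example and are not part of OpenDIBR or COLMAP.

```python
def parse_colmap_images(text):
    """Return {image_name: (quaternion_wxyz, translation_xyz)} from images.txt content."""
    # Drop comment lines only. Empty lines are kept on purpose, because an
    # image without 2D points still occupies a (blank) second line.
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    poses = {}
    for i in range(0, len(lines), 2):
        fields = lines[i].split(maxsplit=9)
        if len(fields) < 10:
            continue  # trailing blank line or malformed entry
        qw, qx, qy, qz, tx, ty, tz = (float(v) for v in fields[1:8])
        poses[fields[9].strip()] = ((qw, qx, qy, qz), (tx, ty, tz))
    return poses


# Hypothetical two-image export, only to show the expected layout.
sample = """# Image list with two lines of data per image:
#   IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME
#   POINTS2D[] as (X, Y, POINT3D_ID)
1 1.0 0.0 0.0 0.0 0.5 -0.25 2.0 1 view_000.png
10.0 20.0 -1 30.0 40.0 5
2 0.7071068 0.0 0.7071068 0.0 0.0 0.0 1.0 1 view_001.png

"""
poses = parse_colmap_images(sample)
print(poses["view_000.png"])
```

    The quaternion and translation returned here are still in COLMAP's convention; they are not yet converted to Euler angles or to another axial system.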

liustu commented 3 months ago

Hi, does this mean that if I directly use the extrinsics estimated by COLMAP, I only need to comment out "omafToOpenGLPosition(/*out*/pos); omafToOpenGLRotation(/*out*/rot);", and then use (yaw, -pitch, -roll) in degrees as the rotation and the translation vector from the extrinsics as the position? I tried this, but I still cannot get accurate results. Is there an error in my process? Thank you.

jlartois commented 3 months ago

Commenting out

omafToOpenGLPosition(/*out*/pos);
omafToOpenGLRotation(/*out*/rot);

is correct, as well as using the translation vector in the extrinsics for the Position. However, for the rotation, the correct order is (pitch, yaw, roll), corresponding to the rotation around the X, Y and Z axes respectively. These Euler angles should indeed be in degrees.

I think you missed the last step that I mentioned. OpenDIBR uses the OpenGL coordinate system, which is not the same as the COLMAP coordinate system. The difference is that the directions of both the Y and Z axis should be inverted.

One thing that I also missed in my previous reply is that the custom data images should be undistorted.
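
To make the conversion concrete, here is a minimal sketch of one way to go from a COLMAP quaternion to (pitch, yaw, roll) in degrees with the Y and Z axes inverted. Several things in it are assumptions rather than confirmed OpenDIBR behaviour: that the Euler angles compose as R = Rz(roll) * Ry(yaw) * Rx(pitch), that inverting the axes means expressing the pose in a frame whose Y and Z point the other way (R becomes F R F with F = diag(1, -1, -1)), and that the rotation is used as stored rather than inverted. All function names are invented for the example, and gimbal lock (yaw near +/-90 degrees) is not handled.

```python
import math


def quat_to_matrix(qw, qx, qy, qz):
    """Row-major 3x3 rotation matrix of a quaternion given in (w, x, y, z) order."""
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    w, x, y, z = qw / n, qx / n, qy / n, qz / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


FLIP = (1.0, -1.0, -1.0)  # diagonal of F: keep X, invert Y and Z


def flip_yz_position(p):
    """Position expressed in the frame with inverted Y and Z axes."""
    return [f * v for f, v in zip(FLIP, p)]


def flip_yz_rotation(R):
    """F * R * F for the diagonal matrix F = diag(1, -1, -1)."""
    return [[FLIP[i] * FLIP[j] * R[i][j] for j in range(3)] for i in range(3)]


def matrix_to_euler_deg(R):
    """(pitch, yaw, roll) in degrees, ASSUMING R = Rz(roll) * Ry(yaw) * Rx(pitch)."""
    yaw = math.asin(max(-1.0, min(1.0, -R[2][0])))
    pitch = math.atan2(R[2][1], R[2][2])
    roll = math.atan2(R[1][0], R[0][0])
    return tuple(math.degrees(a) for a in (pitch, yaw, roll))


# Example: a 30 degree rotation about the Y axis.
half = math.radians(15)
R = quat_to_matrix(math.cos(half), 0.0, math.sin(half), 0.0)
print(matrix_to_euler_deg(R))                    # approximately (0, 30, 0)
print(matrix_to_euler_deg(flip_yz_rotation(R)))  # approximately (0, -30, 0)
print(flip_yz_position([0.5, -0.25, 2.0]))
```

Under these assumptions, inverting Y and Z leaves the pitch unchanged and negates the yaw and roll, while the position becomes (x, -y, -z). If the results are still misaligned, the Euler composition order is the first thing to verify against the OpenDIBR source.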

If this still doesn't work, feel free to email me a minimal reproducible example.

bing-stu commented 3 months ago

Hi, I tried again as suggested; however, it failed again. I have provided some examples in the attachment (attachment.zip). Could you help me transform those parameters into the correct position, rotation, focal, and principle_point? Thank you. Looking forward to your reply.

jlartois commented 3 months ago

Hello again. I pushed an update to OpenDIBR to support 3 axial systems: OMAF, COLMAP and OPENGL. I've also created a Python script for your DTU dataset (which seems to be prepared for something like MVSNet) to generate the input .json file for OpenDIBR. You can download it, as well as the .json file, from here.

If I run the latest commit of OpenDIBR with this json, it works:

RealtimeDIBR -i some_folder -j some_folder/example_colmap.json --cam_speed 1

There is still some misalignment, but I think this is because of how OpenDIBR reads in the depth map. It uses the "Depth_range" = [near, far] from the .json file to convert the 8-bit depth to a Z-depth:

depth = 1.0 / (1.0f / far + depth * ( 1.0f / near - 1.0f / far));

I assumed that the last 2 values in cams/00000000_cam.txt are these [near, far], but I could be wrong. It all depends on how the depth PNGs were normalized.
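
For checking how a depth PNG was normalized, the formula above can be reproduced offline. The sketch below mirrors it in Python and adds the inverse mapping. It assumes the 8-bit value is first scaled to [0, 1] by dividing by 255, which this thread does not confirm; the function names and the near/far values are made up for illustration.

```python
def depth_from_8bit(value, near, far):
    """Z-depth for an 8-bit sample, following the formula quoted above."""
    d = value / 255.0  # ASSUMPTION: 8-bit value normalized to [0, 1]
    return 1.0 / (1.0 / far + d * (1.0 / near - 1.0 / far))


def depth_to_8bit(z, near, far):
    """Inverse mapping: quantize a Z-depth to the 8-bit value that decodes to it."""
    d = (1.0 / z - 1.0 / far) / (1.0 / near - 1.0 / far)
    return round(max(0.0, min(1.0, d)) * 255)


near, far = 0.5, 10.0  # hypothetical "Depth_range"
print(depth_from_8bit(255, near, far))  # brightest value maps to (about) near
print(depth_from_8bit(0, near, far))    # darkest value maps to (about) far
print(depth_to_8bit(depth_from_8bit(128, near, far), near, far))
```

With this mapping the stored value is linear in 1/Z, so a depth map that was normalized linearly in Z would decode incorrectly and could produce misalignment of the kind described above.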