NVlabs / Deep_Object_Pose

Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)

Dope on prerecorded video #231

Closed kooshyarkosari closed 2 years ago

kooshyarkosari commented 2 years ago

Hi,

I would like to run and test DOPE on a YouTube video that contains YCB objects. I don't want to use a RealSense, a webcam, or any other camera, so my question is: is it possible to do that without any camera, using just a prerecorded video?

This is the link for the youtube video https://www.youtube.com/watch?v=m_qualdsmoA&t=743s

Best

mintar commented 2 years ago

At the very least, you would need the camera matrix (fx, fy, cx, cy) of the camera, so that you can generate a _camera_settings.json file (see below for an example). The standard way to obtain the camera matrix is to calibrate your camera using a checkerboard calibration pattern. It would also be best to rectify your images, but you can probably get away without rectification (at a small loss of accuracy) because the video seems to have been recorded with a very low-distortion lens.

{
    "camera_settings": [
        {
            "name": "Viewpoint",
            "horizontal_fov": 90,
            "intrinsic_settings":
            {
                "resX": 533,
                "resY": 400,
                "fx": 492.79220581054688,
                "fy": 492.79220581054688,
                "cx": 276.33590698242188,
                "cy": 194.36210632324219,
                "s": 0
            },
            "captured_image_size":
            {
                "width": 533,
                "height": 400
            }
        }
    ]
}
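Once you have intrinsic values (from calibration or estimation), writing the expected file is straightforward. A minimal sketch using the values from the example above; the filename and all numbers are just placeholders to replace with your own:

```python
import json

# Intrinsics copied from the example settings above; substitute your own.
settings = {
    "camera_settings": [
        {
            "name": "Viewpoint",
            "horizontal_fov": 90,
            "intrinsic_settings": {
                "resX": 533,
                "resY": 400,
                "fx": 492.79220581054688,
                "fy": 492.79220581054688,
                "cx": 276.33590698242188,
                "cy": 194.36210632324219,
                "s": 0,
            },
            "captured_image_size": {"width": 533, "height": 400},
        }
    ]
}

# Write the file DOPE expects alongside its other configuration.
with open("_camera_settings.json", "w") as f:
    json.dump(settings, f, indent=4)
```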
kooshyarkosari commented 2 years ago

Dear Martin, thank you for your response, but my main problem is that I don't want to use any camera or sensor, so I don't need camera calibration. I just want to run DOPE on a prerecorded video in the test phase. Is that possible, or do I have to use a camera (for example, a RealSense) for testing?

TontonTremblay commented 2 years ago

You could also get the intrinsics from COLMAP.

Check the steps at https://github.com/NVlabs/instant-ngp/blob/master/docs/nerf_dataset_tips.md#preparing-new-nerf-datasets — the final JSON gives you the intrinsics as well as the camera poses.

If you know which phone was used, you could probably find ballpark intrinsics online. They won't be perfect, but they would work.
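If even ballpark values are unavailable, a rough pinhole estimate can be derived from the video resolution and an assumed horizontal field of view. This is only an illustration of that approximation, not part of the thread's suggested workflow; the 640x480 / 90° numbers are hypothetical:

```python
import math

def ballpark_intrinsics(width, height, hfov_deg):
    """Estimate pinhole intrinsics from image size and horizontal FOV.

    Assumes square pixels (fx == fy) and a principal point at the
    image center -- a rough approximation, not a real calibration.
    """
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = fx                 # square pixels assumed
    cx = width / 2.0        # principal point assumed at image center
    cy = height / 2.0
    return fx, fy, cx, cy

# Hypothetical example: a 640x480 video with a 90-degree horizontal FOV
fx, fy, cx, cy = ballpark_intrinsics(640, 480, 90.0)
print(fx, fy, cx, cy)  # 320.0 320.0 320.0 240.0
```

The wider the assumed FOV, the shorter the focal length, so an error in the FOV guess directly scales the estimated depth of any pose DOPE produces.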

TontonTremblay commented 2 years ago

You could also just run the network part without PnP.

kooshyarkosari commented 2 years ago

Dear Tonton

I don't want to use a phone camera or any sensor. I just want to test this model on this YouTube video (https://www.youtube.com/watch?v=m_qualdsmoA&t=743s). Note that this video was not recorded by me; I just found it on YouTube.

mintar commented 2 years ago

What @TontonTremblay and I are trying to tell you is this: no, having the video is not enough. You additionally need to provide a _camera_settings.json file, otherwise DOPE won't work. To create such a file, you need to know the fx, fy, cx, cy values (also known as the "camera matrix" or, more precisely, the "camera intrinsics"). @TontonTremblay and I have been suggesting ways you can obtain these values.
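For intuition on why DOPE cannot skip these values: fx, fy, cx, cy define the pinhole projection from 3D camera-frame points to pixels, and the PnP step inverts exactly that mapping to recover the object pose. A minimal sketch of the projection with made-up intrinsics:

```python
def project(point, fx, fy, cx, cy):
    """Project a 3D point (camera frame, meters) to pixel coordinates
    using the pinhole model: u = fx*X/Z + cx, v = fy*Y/Z + cy."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy

# Hypothetical intrinsics and a point 2 m in front of the camera
u, v = project((0.1, -0.05, 2.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(u, v)  # 345.0 227.5
```

Without fx, fy, cx, cy there is no way to relate the 2D keypoints the network detects to the 3D cuboid of the object, which is why the video alone is insufficient.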