sony / genwarp

MIT License

Camera pose to src_view_mtx #8

Open liuxiaoyu1104 opened 2 months ago

liuxiaoyu1104 commented 2 months ago

If I have a camera pose from Droid-SLAM in the format 'timestamp tx ty tz qx qy qz qw', how do I get src_view_mtx? Besides converting the quaternion (qx qy qz qw) to a rotation matrix and combining it with the translation vector (tx ty tz) to form the RT matrix, is there anything else I need to do?
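The conversion described in the question can be sketched as follows. This is a minimal illustration, not code from GenWarp or Droid-SLAM: the helper names `quat_to_rotation` and `pose_line_to_matrix` are hypothetical, the quaternion is assumed to be a scalar-last (qx qy qz qw) Hamilton quaternion as the quoted format suggests, and the sketch makes no claim about whether the resulting matrix is camera-to-world or world-to-camera, which the thread does not state.

```python
import math


def quat_to_rotation(qx, qy, qz, qw):
    """Convert a scalar-last quaternion to a 3x3 rotation matrix (nested lists)."""
    norm = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    # Normalize so that small rounding errors in the file do not skew the rotation.
    x, y, z, w = qx / norm, qy / norm, qz / norm, qw / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def pose_line_to_matrix(line):
    """Parse 'timestamp tx ty tz qx qy qz qw' into a 4x4 RT matrix (nested lists)."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError("expected 8 whitespace-separated fields")
    _, tx, ty, tz, qx, qy, qz, qw = (float(f) for f in fields)
    rot = quat_to_rotation(qx, qy, qz, qw)
    # Stack [R | t] on top of the homogeneous row [0 0 0 1].
    return [rot[0] + [tx], rot[1] + [ty], rot[2] + [tz], [0.0, 0.0, 0.0, 1.0]]
```

The nested lists can be wrapped in `torch.tensor(...)` before use. Whether the matrix must also be inverted depends on the pose convention of the trajectory file, which has to be checked separately.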

kaz-sony commented 2 months ago

Hi @liuxiaoyu1104, thank you for your interest. First of all, let me clarify that I have never used Droid-SLAM, so I can't say anything specific to it. Assuming you mean src_view_mtx in the README example, that variable specifies the source camera pose, i.e. the pose from which the reference input image was taken. You also need to set the target camera pose (tar_view_mtx) to calculate the relative camera pose (rel_view_mtx), which is then input to the GenWarp model as a condition. Since the model only receives the relative camera pose, it doesn't matter which coordinate system you use to describe the source and target camera positions, so long as the relative camera pose is right. So I assume that, as long as you create the target camera pose in the same way the source camera pose is derived, there is nothing special you need to do, although I can't be sure. I recommend some trial and error until you get the result you expect.
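The point about the relative pose can be illustrated with a small sketch. It assumes that both view matrices are rigid 4x4 world-to-camera transforms and that the relative pose is composed as `tar_view_mtx @ inverse(src_view_mtx)`. That composition is an assumption here and should be checked against the README example. The helper names are hypothetical.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(m):
    """Invert a rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [m[i][3] for i in range(3)]
    new_t = [-sum(rot_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def relative_view(src_view_mtx, tar_view_mtx):
    """Relative pose taking source-camera coordinates to target-camera coordinates.

    Assumes both inputs are world-to-camera matrices (an assumption of this sketch).
    """
    return matmul4(tar_view_mtx, rigid_inverse(src_view_mtx))


# Re-expressing both cameras in a different world frame G (right-multiplying
# both view matrices by the same rigid transform) leaves the relative pose
# unchanged, because G cancels: (tar G)(src G)^-1 = tar src^-1.
src = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
tar = [[0.0, -1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 2.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
world_change = [[1.0, 0.0, 0.0, 5.0], [0.0, 0.0, -1.0, 6.0], [0.0, 1.0, 0.0, 7.0], [0.0, 0.0, 0.0, 1.0]]
rel_a = relative_view(src, tar)
rel_b = relative_view(matmul4(src, world_change), matmul4(tar, world_change))
```

Under these assumptions `rel_a` and `rel_b` agree, which is why the choice of world coordinate system does not matter as long as both poses are expressed consistently. The cancellation covers the world frame only. It does not cover a change in the camera's own axis convention.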

liuxiaoyu1104 commented 2 months ago

Thank you for your reply. These are my two RT matrices, which I use as src_view_mtx and tar_view_mtx:

tensor([[-1.,  0.,  0., -0.],
        [ 0., -1., -0., -0.],
        [-0.,  0.,  1., -0.],
        [ 0.,  0.,  0.,  1.]])

tensor([[-0.5072, -0.1023, -0.8557,  0.3367],
        [-0.0926, -0.9807,  0.1721, -0.1505],
        [-0.8568,  0.1665,  0.4880,  1.4396],
        [ 0.0000,  0.0000,  0.0000,  1.0000]])

However, I feel that this coordinate system and the one obtained from camera_lookat don't align properly.

kaz-sony commented 2 months ago

You don't need your view matrices to match the ones returned by the camera_lookat function. The function is only a helper that creates a view matrix in a predetermined coordinate system. Like I said, all you need is the relative camera pose. Have you already tried GenWarp with your matrices? How was the resulting image? It isn't really a problem until you have actually tried to generate a sample.

ewrfcas commented 3 weeks ago

Similar question: how do I convert an OpenCV camera to GenWarp coordinates? I used [[0, 0, -1], [1, 0, 0], [0, -1, 0]], but it doesn't seem to work.
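The thread does not state which camera axis convention GenWarp expects, so the following is only a sketch of how a camera-axis change is usually applied, not a confirmed answer. It assumes 4x4 world-to-camera view matrices, an OpenCV-style source convention (x right, y down, z forward) and, purely as an assumption, an OpenGL-style target convention (x right, y up, z backward). All names are hypothetical.

```python
# Axis flip between OpenCV-style and OpenGL-style camera coordinates.
# It negates y and z, and it is its own inverse.
FLIP_YZ = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def flip_view_matrix(view_mtx):
    """Change the camera axes of a world-to-camera matrix.

    The camera-space coordinates are the output of a view matrix,
    so the flip is applied on the left.
    """
    return matmul4(FLIP_YZ, view_mtx)


def flip_relative_pose(rel_view_mtx):
    """Change the camera axes of a relative pose.

    Both the input and the output of a relative pose live in camera space,
    so the flip is applied on both sides: F @ rel @ F^-1, with F^-1 == F.
    """
    return matmul4(matmul4(FLIP_YZ, rel_view_mtx), FLIP_YZ)
```

Unlike a change of world frame, a change of camera axes does not cancel in the relative pose, so under these assumptions it has to be applied consistently to both the source and target view matrices, or once to the relative pose on both sides.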