tijiang13 / InstantAvatar


Novel View Synthesis Using Camera Instead Of SMPL Pose #71

Closed muyangye closed 3 weeks ago

muyangye commented 2 months ago

Hi Tianjian,

I recently asked you in #67 about the global_orient parameter. First of all, thank you for your response! The SMPL visualizer helped me fully understand how InstantAvatar's inference works. It seems that there are 2 types of novel view synthesis:

  1. SMPL model is fixed, input is camera's translation and rotation in world coordinates
  2. Camera is fixed, input is SMPL model's translation and rotation in relative coordinates to the camera
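
A minimal sketch of why the two types can produce the same view, assuming the SMPL model is treated as plain mesh vertices and the camera pose is a 4x4 camera-to-world matrix. The helper names (`rotation_y`, `translation`, `transform_points`), the sample vertices, and the orbit radius of 5 are hypothetical and not taken from InstantAvatar. Converting the resulting model transform into SMPL `global_orient` and translation parameters is not shown here.

```python
import numpy as np

def rotation_y(theta):
    """4x4 homogeneous rotation about the y-axis (assumed to be 'up')."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [-s, 0.0, c, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def translation(x, y, z):
    """4x4 homogeneous translation."""
    m = np.eye(4)
    m[:3, 3] = [x, y, z]
    return m

def transform_points(matrix, points):
    """Apply a 4x4 transform to an (N, 3) array of points."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    return (homo @ matrix.T)[:, :3]

# Hypothetical stand-in for SMPL vertices in world coordinates.
verts_world = np.array([[0.0, 0.0, 0.0],
                        [0.1, 0.5, 0.0],
                        [-0.1, 1.0, 0.2]])
theta = np.deg2rad(30.0)
radius = 5.0

# Type 1: the model is fixed and the camera orbits the origin.
c2w = rotation_y(theta) @ translation(0.0, 0.0, -radius)
verts_cam_type1 = transform_points(np.linalg.inv(c2w), verts_world)

# Type 2: the camera stays at the identity pose and the model moves instead.
# The model transform is the inverse of the type 1 camera pose:
# rotate the model by -theta, then push it `radius` units along +z.
model_transform = translation(0.0, 0.0, radius) @ rotation_y(-theta)
verts_cam_type2 = transform_points(model_transform, verts_world)

same_view = np.allclose(verts_cam_type1, verts_cam_type2)
```

Under these assumptions `same_view` is true: moving the camera by a pose and moving the model by the inverse of that pose give the same camera-space vertices.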

I think InstantAvatar uses type 2 novel view synthesis. However, I am trying to render the images on a VR headset such as Oculus via Unity, and Unity only supports type 1: it has no concept of an SMPL model, but it has built-in functions to record everything about the camera, such as its CameraToWorld matrix. Therefore, I have simulated a circular motion of the camera around the origin (billboard) like this: [image: 1713341567923]

In this figure, the solid line represents the initial novel view synthesis process, where rays from the camera pass through the SMPL model and finally hit the billboard. The dotted line represents a novel view synthesis when the camera is at a specific point on its circular orbit.

Besides moving the camera itself and pointing its field of view at the SMPL model, I also set the SMPL model's translation to the camera's translation plus an offset of {+/-1, +/-1, +/-1} (each sign is chosen so that the model ends up closer to the origin, i.e. between the camera and the origin) so that some rays hit the billboard. I am able to get results this way, but it is easy to see that this is not a fully correct approach. Rather, the correct approach should be one of the following 2:

  1. Move the billboard to the opposite side of the camera, using the SMPL model as the center of symmetry: [image: 1713342461786]

  2. Fix the camera and billboard, and only rotate the SMPL model (InstantAvatar's approach): [image: 1713342732122]

I am trying to implement both approaches, but I have some questions after reading the code:

  1. For approach 1, is there a way to change the translation of the billboard at all? I have not seen a way to do this in the code
  2. For approach 2, ideally I could just rotate the SMPL model in the reverse direction of the camera's motion in Unity (only the Unity camera moves; InstantAvatar's camera is fixed, i.e. c2w at line 43 of animate.py is still np.eye(4)). However, since the camera in Unity is also translating, the rotation obtained from pointing the camera at the SMPL model is relative to the camera's current translation. To account for that, I also need to change the SMPL model's translation. My question is: why do line 49 and line 50 of animate.py fix the x-axis translation of the SMPL model? I thought the demo rotates around the y-axis (so the x and z translations should change)? [image: 1713343846283]

Sorry for raising such a long and demanding issue. I know your time is valuable, but I would greatly appreciate your (or anyone's) help with the 2 questions above! Please also don't hesitate to point out anything I have misunderstood. Thank you very much!

Best, Muyang

tijiang13 commented 2 months ago

Hi Muyang,

  1. You can move the billboard, as it's just a mesh. In our script the billboard is bound to the camera so we did not handle it separately.
    1. I think in general you can treat SMPL as meshes and then you can apply the same transformation to both SMPL meshes and billboard.
  2. Yes, for aist-demo, y is the upwards direction. In line 49, we move the first frame to the origin, and in line 50, we move the SMPL mesh away from the camera (setting z to 5) so that it's visible in the camera.
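
A minimal sketch of the placement described in point 2, assuming the per-frame SMPL translations are stored as a (T, 3) array with y up and the camera looking along +z. This is not the actual code at lines 49 and 50 of animate.py. The function name `place_in_front_of_camera` and the `depth` parameter are hypothetical, and reading "setting z to 5" as a constant z offset is one interpretation.

```python
import numpy as np

def place_in_front_of_camera(transl, depth=5.0):
    """Recenter a (T, 3) translation sequence and push it along +z.

    Hypothetical helper: subtracts the first frame's translation so the
    sequence starts at the origin, then offsets z by `depth` so the mesh
    sits in front of a camera at the identity pose.
    """
    transl = np.asarray(transl, dtype=float)
    centered = transl - transl[0]   # first frame moves to the origin
    centered[:, 2] += depth         # move the whole sequence away from the camera
    return centered

# Made-up translations for two frames.
transl = np.array([[1.0, 2.0, 3.0],
                   [1.5, 2.0, 3.5]])
placed = place_in_front_of_camera(transl)
```

With this reading, the first frame ends up at (0, 0, 5) and the frame-to-frame motion is unchanged, so any x and z movement in the original sequence is still present after the shift.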

I hope this answers your questions. Let me know if you have any further questions.

Best, Tianjian

muyangye commented 2 months ago

Hi Tianjian,

Thanks for the quick reply! If I understand correctly, it is possible to move the billboard at line 14 of animate.py, right? I am a little confused about how to add a translation along the x and z axes to the billboard (I still want the billboard to span the x and y axes; I just want to change its x and z translation in world coordinates). Could you please share how to do this?

Best, Muyang

tijiang13 commented 2 months ago

Hi Muyang,

If you are using this billboard you can change its position attribute directly. Otherwise you can calculate the relative transformation (RT) between the billboard and camera and re-apply it once you move the camera.
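
A minimal sketch of the second option, assuming both the camera and the billboard poses are available as 4x4 homogeneous matrices in world coordinates. The helper names (`relative_pose`, `reattach`, `rotation_y`, `translation`) and the example poses are hypothetical and not part of InstantAvatar.

```python
import numpy as np

def rotation_y(theta):
    """4x4 homogeneous rotation about the y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [-s, 0.0, c, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def translation(x, y, z):
    """4x4 homogeneous translation."""
    m = np.eye(4)
    m[:3, 3] = [x, y, z]
    return m

def relative_pose(c2w, billboard_to_world):
    """Billboard pose expressed in the camera frame (the 'RT')."""
    return np.linalg.inv(c2w) @ billboard_to_world

def reattach(new_c2w, billboard_in_camera):
    """World pose of the billboard after the camera has moved."""
    return new_c2w @ billboard_in_camera

# Initial setup: camera at the identity pose, billboard 10 units along +z.
c2w = np.eye(4)
billboard = translation(0.0, 0.0, 10.0)
rt = relative_pose(c2w, billboard)

# Move the camera, then re-apply the stored relative transformation.
new_c2w = translation(1.0, 0.0, 2.0) @ rotation_y(np.pi / 2)
new_billboard = reattach(new_c2w, rt)
```

Because the relative transformation is stored once and re-applied, the billboard stays at the same place in the camera's view no matter how the camera moves.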

Best, Tianjian

muyangye commented 3 weeks ago

Solved. Thanks!