nerfstudio-project / viser

Web-based 3D visualization + Python
https://viser.studio/latest
Apache License 2.0

How to Obtain an Image Rendered by a Camera Frustum from Its Pose? #335

Open Unimendacity opened 1 week ago

Unimendacity commented 1 week ago

Could you explain how to obtain an image rendered by a camera frustum at a specific pose, given the camera's parameters (fov, aspect, scale), and how to display that image in a separate window, as in the attached picture? Thank you! [image: vuer_images_rendered_by_camera_frustum]

brentyi commented 1 week ago

Hi there!

There's a get_render() method that can help you with the rendering:

https://viser.studio/latest/client_handles/#viser.ClientHandle.get_render
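As a rough sketch of how a frustum's parameters could map onto that call: the helper below derives the render width from the frustum's aspect ratio (assumed here to mean width / height) and forwards an explicit pose. The keyword names `wxyz`, `position`, and `fov` are my reading of the linked documentation, so please check them against your installed version; `render_size` and `render_from_pose` are hypothetical helper names, not part of viser.

```python
def render_size(height: int, aspect: float) -> tuple[int, int]:
    """Return (height, width) for a frustum whose aspect is width / height."""
    return height, max(1, round(height * aspect))


def render_from_pose(client, wxyz, position, fov, aspect, height=480):
    """Ask one connected client to render from an explicit camera pose.

    `client` is expected to behave like a viser.ClientHandle. The keyword
    arguments below are an assumption based on the linked docs.
    """
    height, width = render_size(height, aspect)
    return client.get_render(
        height=height,
        width=width,
        wxyz=wxyz,          # camera orientation as a (w, x, y, z) quaternion
        position=position,  # camera position in world coordinates
        fov=fov,            # assumed to be the vertical field of view, in radians
    )
```

With a running server, this would be called once per connected client, for example inside `for client in server.get_clients().values(): ...`. Because the render happens in the browser, at least one client has to be connected.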

For showing images (including the rendered one), one option is to use Plotly:

https://viser.studio/latest/examples/23_plotly/

Unimendacity commented 1 week ago

Thank you for your response! I was able to render images using viser.ClientHandle.get_render, but I’ve encountered a few issues:

  1. The images obtained via viser.ClientHandle.get_render are not rendered from the specified camera pose. Additionally, when I move the current view in the web interface, the rendered image changes accordingly.
  2. Rendering an image after updating the camera pose is very time-consuming. Based on my measurements, viser.ClientHandle.get_render takes approximately 0.5 seconds. Furthermore, the main view in the web interface becomes blurry after the camera pose is updated as shown in the attached picture.

Lastly, I would like to know if the URDF pose in viser can be updated relative to the world coordinate system. I couldn’t find any relevant API for this in ViserUrdf.

Thank you in advance for your help!

[images: viser_gen_render_bug_01, viser_gen_render_bug_02, viser_gen_render_bug_03]

brentyi commented 1 week ago

For (1) I just wrote a test script; rendering works fine for me with arbitrary camera poses. Can you see if you can reproduce this?

https://github.com/user-attachments/assets/90efcf6d-ed49-4356-bdf4-12054e854a50

Here's the script: https://gist.github.com/brentyi/e5942d8a5630f20115d52084cfbd2835

For (2), I unfortunately don't have a good solution for speed. Part of this could be Plotly overhead; Plotly is slow. I'm planning to add a more performant API for sending images (cc #294). The rendering also all happens in your web browser, which is convenient for many reasons, but as a result there's overhead from both rendering and network communication.
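To see how much of the 0.5 seconds comes from the render round trip versus the Plotly update, one option is to time each stage separately. This is only a minimal sketch using the standard library; `timed` is a hypothetical helper, and the commented usage assumes a connected `client`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


# Hypothetical usage with a connected client:
# timings = {}
# with timed("get_render", timings):
#     image = client.get_render(height=480, width=640)
# with timed("plotly_update", timings):
#     ...  # rebuild and send the Plotly figure here
# print(timings)
```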

If you want a quick fix for the blurriness issue you can install viser from source with this line deleted: https://github.com/nerfstudio-project/viser/blob/9096d190e971e90f79424b7f8a4ef79c008483d2/src/viser/client/src/App.tsx#L471

For (3) moving the root coordinate frame of the URDF, you can do:

server.scene.add_frame("/robot_root_frame", show_axes=False, wxyz=..., position=...)

And then pass "/robot_root_frame" in for this ViserUrdf argument: https://github.com/nerfstudio-project/viser/blob/9096d190e971e90f79424b7f8a4ef79c008483d2/src/viser/extras/_urdf.py#L32
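Once the URDF hangs off that frame, moving the robot in world coordinates should come down to updating the frame handle's `wxyz` and `position`. A small sketch, assuming a planar base pose (translation plus yaw about the world z-axis); `yaw_to_wxyz` and `place_robot_root` are hypothetical helpers, and the `root_node_name` keyword in the commented usage is my assumption for the argument the link above points at.

```python
import math


def yaw_to_wxyz(yaw_rad: float) -> tuple[float, float, float, float]:
    """Quaternion (w, x, y, z) for a rotation of yaw_rad about the z-axis."""
    half = 0.5 * yaw_rad
    return (math.cos(half), 0.0, 0.0, math.sin(half))


def place_robot_root(root_frame, position, yaw_rad):
    """Move the URDF root; root_frame is the handle returned by add_frame()."""
    root_frame.wxyz = yaw_to_wxyz(yaw_rad)
    root_frame.position = position


# Hypothetical usage with a running server:
# root = server.scene.add_frame("/robot_root_frame", show_axes=False)
# urdf = ViserUrdf(server, urdf_or_path=..., root_node_name="/robot_root_frame")
# place_robot_root(root, position=(1.0, 0.0, 0.0), yaw_rad=math.pi / 2)
```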

Unimendacity commented 1 day ago

Thank you for your response. I tried the solution you provided but encountered the following issues:

  1. As shown in the video, get_render correctly renders the world_axes from the specified camera pose. However, it fails to render the Gaussian splats correctly: in the rendered image, the splats correspond to the user's current view, not the specified camera pose.

  2. When trying to retrieve the pose of a joint in the URDF (e.g., "J_head_yaw") using the code below, I noticed that changing the joint states of its parent (e.g., "J_head_pitch") does not update the pose of "J_head_yaw". The code used to get the joint pose is as follows:

    joint_index = list(self.viser_urdf._urdf.joint_map.keys()).index(self.config.joint_name)
    frame_handle = self.viser_urdf._joint_frames[joint_index]

Thank you for your help!

https://github.com/user-attachments/assets/fee92b00-13df-4266-b064-a9c539b52009

brentyi commented 1 day ago

(1) is a bug I hadn't considered, thanks for pointing that out! I fixed this in #344.

For (2), are you accessing the pose of a frame with frame_handle.wxyz and frame_handle.position? If so, this may be expected behavior, since poses in viser are all relative to a node's parent. If you need an absolute transform, you'll need to either manually chain transformations along the kinematic tree or use the underlying yourdfpy object.
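Manually chaining could look roughly like the sketch below, which folds a list of parent-relative poses, ordered from the root down to the joint of interest, into a single world pose. It uses plain tuples in viser's (w, x, y, z) quaternion convention; the function names are hypothetical, and collecting the chain of frame handles from the kinematic tree is left out.

```python
def quat_multiply(a, b):
    """Hamilton product of two (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_rotate(q, v):
    """Rotate 3-vector v by unit quaternion q, computed as q * (0, v) * conj(q)."""
    w, x, y, z = q
    rotated = quat_multiply(quat_multiply(q, (0.0, *v)), (w, -x, -y, -z))
    return rotated[1:]


def compose(parent, child):
    """T_world_child = T_world_parent * T_parent_child; poses are (wxyz, position)."""
    parent_q, parent_p = parent
    child_q, child_p = child
    offset = quat_rotate(parent_q, child_p)
    position = tuple(p + o for p, o in zip(parent_p, offset))
    return quat_multiply(parent_q, child_q), position


def world_pose(chain):
    """Fold parent-relative poses, ordered root to leaf, into one world pose."""
    pose = ((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
    for local in chain:
        pose = compose(pose, local)
    return pose
```

Each entry in `chain` would be `(tuple(handle.wxyz), tuple(handle.position))` for one frame on the path from the root to the joint, so a change to a parent joint shows up in the result even though the child's own local pose is unchanged.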