nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0

ns-viewer cannot establish connection to websocket (Google Colab) #3141

Open Luca-Wiehe opened 5 months ago

Luca-Wiehe commented 5 months ago

Bug Description: When using ns-viewer in Google Colab, Viser fails to establish a connection to the visualized training instance.

I was hoping to work around this by setting the --viewer.make-share-url True flag. However, a shared URL is printed when using ns-train but not when using ns-viewer.

As a solution, I would appreciate a suggestion on how to access the shared URL of the ns-viewer or a way to resolve the connection issue with Colab's /localhost:7007/.

To Reproduce: Steps to reproduce the behavior:

  1. Have a trained scene available, or train one using
     !ns-train <model_name> --viewer.make-share-url True --data /content/.../path_to_data

     Along with some training output, a viewer link is printed to the console:

     Viewer at: http://localhost:7007/ or https://input-inference.share.viser.studio/

  2. Use the following code to find out where Google Colab serves its localhost:7007:
     from google.colab import output
     output.serve_kernel_port_as_window(7007)

     https://localhost:7007/

  3. Load the viewer from the resulting config.yml:
     !ns-viewer --viewer.make-share-url True --load-config /content/.../config.yml

Expected behavior: After step 3, I would expect the viewer to be available through the Colab localhost link from step 2. Although ns-viewer does serve on that port (a Viser page loads there), the page always remains in the "Connecting..." state.
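One way to narrow this down from inside the notebook is to check whether anything is accepting TCP connections on port 7007 at all. The sketch below uses only Python's standard library; `port_is_listening` is a hypothetical helper name, not part of nerfstudio or Viser. It only confirms that a server is listening on the port and does not exercise the websocket handshake that the viewer page depends on.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing accepts connections on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_listening("127.0.0.1", 7007)` in a cell while ns-viewer is running should return True if the server is up. If it does and the page still shows "Connecting...", the problem is more likely in how the websocket is reached through the Colab proxy than in the viewer process itself.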

Additionally, I would expect a shared link to be printed (as is the case when using ns-train) such that a publicly accessible viewer instance can be accessed.

Together, these two problems make it impossible to visualize any pretrained instance, which makes nerfstudio practically incompatible with Google Colab.

Screenshots: Viser never leaves the "Connecting..." state.

[Image: Screenshot 2024-05-13 at 07 57 06]

Additional context: Here is the full console output that is printed when executing ns-viewer:

2024-05-13 05:54:28.780850: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-13 05:54:28.780905: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-13 05:54:28.782402: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-13 05:54:28.790266: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-13 05:54:29.959001: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[05:54:37] Auto image downscale factor of 1

ZaychikLiu commented 2 days ago

Since the training process runs on the server, viewing the results on the PC (client) requires forwarding the server's port to the local machine. For example, if the SSH command for the server is:

    ssh -p 48332 root@region-3.autodl.com

and the URL for viewing results is http://localhost:7007/, we need to forward the server's port 7007 to the client's port 7007, while the connection port stays the same as in the SSH command. Enter the following in cmd or PowerShell:

    ssh -CNg -L 7007:127.0.0.1:7007 root@region-3.autodl.com -p 48332

Here, region-3.autodl.com takes the place of an IP address, and if a password is requested, it is the server's login password.
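If you have to do this for several servers, the forwarding command can be assembled from the pieces of the original SSH login. This is only a rough sketch: `build_tunnel_command` is a hypothetical helper name, and the user, host, and port values are the example ones from this comment. It builds the argument list and does not run anything.

```python
import shlex


def build_tunnel_command(user: str, host: str, ssh_port: int,
                         viewer_port: int = 7007) -> list[str]:
    """Build an ssh argument list that forwards viewer_port to the client."""
    return [
        "ssh",
        "-C",  # compress traffic
        "-N",  # do not run a remote command, only forward ports
        "-g",  # let other hosts connect to the forwarded local port
        "-L", f"{viewer_port}:127.0.0.1:{viewer_port}",
        "-p", str(ssh_port),  # same SSH port as the normal login
        f"{user}@{host}",
    ]


# Example values taken from the SSH login shown in this comment.
command = build_tunnel_command("root", "region-3.autodl.com", 48332)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command)` on the client. While the tunnel is running, http://localhost:7007/ in the local browser should reach the viewer on the server.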