graphdeco-inria / hierarchical-3d-gaussians

Official implementation of the SIGGRAPH 2024 paper "A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets"

RuntimeError: CUDA error: invalid configuration argument During render_hierarchy #60

Open leo9344 opened 2 months ago

leo9344 commented 2 months ago

Hello everyone!

I am trying to run render_hierarchy.py on the Wayve101-Scene004 data, and I get the following error:


Rendering ${DATASET_DIR}
------------LLFF HOLD-------------
Reading camera 1000/1000
125 test images
875 train images
Making Training Dataset
Making Test Dataset
WARNING: NO ANCHORS FOUND
No exposure to be loaded at ${DATASET_DIR}/exposure.json
  0%|          | 0/125 [00:00<?, ?it/s][ INFO ] Encountered quite large input images (>1.6K pixels width), rescaling to 1.6K.
 If this is not desired, please explicitly specify '--resolution/-r' as 1
  0%|          | 0/125 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/path/to/code/hierarchical-3d-gaussians/render_hierarchy.py", line 140, in <module>
    render_set(args, scene, pipe, os.path.join(args.out_dir, f"render_{tau}"), tau, args.eval)
  File "/path/to/anaconda3/envs/h3dgs/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/code/hierarchical-3d-gaussians/render_hierarchy.py", line 82, in render_set
    image = torch.clamp(render_post(
                        ^^^^^^^^^^^^
  File "/path/to/code/hierarchical-3d-gaussians/gaussian_renderer/__init__.py", line 158, in render_post
    screenspace_points = torch.zeros_like(pc.get_xyz, dtype=pc.get_xyz.dtype, requires_grad=True, device="cuda") + 0
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
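
As the error message notes, CUDA errors can be reported asynchronously, so the line shown in the traceback may not be the call that actually failed. A minimal sketch of re-running the script with synchronous kernel launches follows. The helper names `blocking_env` and `rerun_blocking` are hypothetical, and the script arguments are assumed to be whatever was passed in the failing run:

```python
import os
import subprocess
import sys


def blocking_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Makes each CUDA kernel launch block until it finishes, so the Python
    # traceback points at the call that actually raised the error.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_blocking(script, script_args):
    """Re-run `script` with the current interpreter and the original arguments.

    `script_args` is a list of strings; it is not filled in here because the
    exact command line of the failing run is not shown in this report.
    """
    cmd = [sys.executable, script, *script_args]
    return subprocess.run(cmd, env=blocking_env(), check=False)
```

Calling `rerun_blocking("render_hierarchy.py", original_args)` with the same arguments as the failing run should produce a traceback that is more likely to point at the offending kernel launch. The variable has to be set before the process initializes CUDA, which is why it is passed through the child environment rather than set after `torch` is already in use.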

By the way, full_train.py seems to work correctly in my environment and produces the following outputs:

outputs
│  cameras.json
│  input.ply
│  merged.hier
│
└─scaffold
    │  cameras.json
    │  cfg_args
    │  exposure.json
    │  input.ply
    │
    └─point_cloud
        └─iteration_30000
                pc_info.txt
                point_cloud.ply

Environment:
OS: Ubuntu 20.04.4 LTS
GPU: NVIDIA A800
Driver Version: 535.183.01
CUDA Version: 12.2
Python Version: 3.12.4

Pip package versions:

Package                     Version
--------------------------- ------------
aiofiles                    23.2.1
altair                      5.4.0
annotated-types             0.7.0
anyio                       4.4.0
attrs                       24.2.0
certifi                     2024.7.4
charset-normalizer          3.3.2
click                       8.1.7
contourpy                   1.2.1
cycler                      0.12.1
diff_gaussian_rasterization 0.0.0
exif                        1.6.0
fastapi                     0.112.1
ffmpy                       0.4.0
filelock                    3.13.1
fonttools                   4.53.1
fsspec                      2024.2.0
gaussian_hierarchy          0.0.0
gradio                      4.29.0
gradio_client               0.16.1
gradio_imageslider          0.0.20
h11                         0.14.0
httpcore                    1.0.5
httpx                       0.27.0
huggingface-hub             0.24.5
idna                        3.7
importlib_resources         6.4.3
Jinja2                      3.1.3
joblib                      1.4.2
jsonschema                  4.23.0
jsonschema-specifications   2023.12.1
kiwisolver                  1.4.5
markdown-it-py              3.0.0
MarkupSafe                  2.1.5
matplotlib                  3.9.2
mdurl                       0.1.2
mpmath                      1.3.0
narwhals                    1.4.2
networkx                    3.2.1
numpy                       1.26.3
nvidia-cublas-cu12          12.1.3.1
nvidia-cuda-cupti-cu12      12.1.105
nvidia-cuda-nvrtc-cu12      12.1.105
nvidia-cuda-runtime-cu12    12.1.105
nvidia-cudnn-cu12           8.9.2.26
nvidia-cufft-cu12           11.0.2.54
nvidia-curand-cu12          10.3.2.106
nvidia-cusolver-cu12        11.4.5.107
nvidia-cusparse-cu12        12.1.0.106
nvidia-nccl-cu12            2.20.5
nvidia-nvjitlink-cu12       12.1.105
nvidia-nvtx-cu12            12.1.105
opencv-python               4.9.0.80
orjson                      3.10.7
packaging                   24.1
pandas                      2.2.2
pillow                      10.2.0
pip                         24.2
plum-py                     0.8.7
plyfile                     1.1
pydantic                    2.8.2
pydantic_core               2.20.1
pydub                       0.25.1
Pygments                    2.18.0
pyparsing                   3.1.2
python-dateutil             2.9.0.post0
python-multipart            0.0.9
pytz                        2024.1
PyYAML                      6.0.2
referencing                 0.35.1
requests                    2.32.3
rich                        13.7.1
rpds-py                     0.20.0
ruff                        0.6.1
scikit-learn                1.5.1
scipy                       1.14.0
semantic-version            2.10.0
setuptools                  72.1.0
shellingham                 1.5.4
simple_knn                  0.0.0
six                         1.16.0
sniffio                     1.3.1
starlette                   0.38.2
sympy                       1.12
threadpoolctl               3.5.0
timm                        0.4.5
tomlkit                     0.12.0
torch                       2.3.0+cu121
torchaudio                  2.3.0+cu121
torchvision                 0.18.0+cu121
tqdm                        4.66.5
typer                       0.12.4
typing_extensions           4.12.2
tzdata                      2024.1
urllib3                     2.2.2
uvicorn                     0.30.6
websockets                  11.0.3
wheel                       0.43.0
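
When checking a package list like the one above, it can help to confirm that the PyTorch wheels and the bundled NVIDIA wheels target the same CUDA major version. The following is a rough sketch that parses `pip list` style text. The function names `cuda_tags` and `same_cuda_major` are hypothetical, and the check only compares the tag prefix (for example `cu12`), so it is a sanity check rather than proof that the installation is correct:

```python
import re


def cuda_tags(pip_list_text):
    """Map package name -> CUDA tag found in its version or name (e.g. 'cu121', 'cu12')."""
    tags = {}
    for line in pip_list_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, version = parts
        # A tag appears either as a local version suffix ("2.3.0+cu121")
        # or as a package-name suffix ("nvidia-cublas-cu12").
        match = re.search(r"\+(cu\d+)$", version) or re.search(r"-(cu\d+)$", name)
        if match:
            tags[name] = match.group(1)
    return tags


def same_cuda_major(tags):
    """Return True if every tag shares the same 'cuNN' prefix (e.g. 'cu12')."""
    prefixes = {tag[:4] for tag in tags.values()}
    return len(prefixes) <= 1
```

Applied to the list above, `torch`, `torchaudio`, and `torchvision` carry the `cu121` tag and the `nvidia-*` wheels carry `cu12`, so this particular check finds no mismatch.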

Does anyone have any idea about this error? If more detailed information is needed, please let me know.

Thanks for your help!