YuxueYang1204 / TrimGS

Trim 3D Gaussian Splatting for Accurate Geometry Representation
https://trimgs.github.io/
189 stars, 4 forks

The DTU dataset is not working for training. #9

jhee-han closed this issue 2 months ago

jhee-han commented 3 months ago

When I ran python scripts/run_dtu.py, no error was reported, but training did not start either: nvidia-smi showed no GPU activity.

Previously, I encountered the following errors:

File "/hdd/jhee/3D/TrimGS/Trim3DGS/scene/dataset_readers.py", line 24, in <module>
from scene.gaussian_model import BasicPointCloud
File "/hdd/jhee/3D/TrimGS/Trim3DGS/scene/gaussian_model.py", line 19, in <module>
from pytorch3d.ops import knn_points
File "/home/jhee/miniconda3/envs/trimgs/lib/python3.8/site-packages/pytorch3d/ops/__init__.py", line 5, in <module>
from .graph_conv import GraphConv
File "/home/jhee/miniconda3/envs/trimgs/lib/python3.8/site-packages/pytorch3d/ops/graph_conv.py", line 8, in <module>
from pytorch3d import _C
ImportError: libcudart.so.10.1: cannot open shared object file: No such file or directory

`(trimgs) jhee@vsclab04:/hdd/jhee/3D/TrimGS/Trim3DGS$ python scripts/run_dtu.py
Starting job on GPU 0 with scene 24

OMP_NUM_THREADS=4 CUDA_VISIBLE_DEVICES=0 python train.py -s data/dtu_dataset/DTU/scan24 -m output/DTU_3DGS/scan24 -r 2
Traceback (most recent call last):
File "train.py", line 16, in <module>
from gaussian_renderer import render, network_gui
File "/hdd/jhee/3D/TrimGS/Trim3DGS/gaussian_renderer/__init__.py", line 14, in <module>
from diff_gaussian_rasterization import GaussianRasterizationSettings, GaussianRasterizer
File "/home/jhee/miniconda3/envs/trimgs/lib/python3.8/site-packages/diff_gaussian_rasterization/__init__.py", line 15, in <module>
from . import _C
ImportError: /home/jhee/miniconda3/envs/trimgs/lib/python3.8/site-packages/torch/lib/libtorch_cuda_cpp.so: undefined symbol: _ZTIN4c10d12ProcessGroup4WorkE
Job (0, (24, 2)) has finished., rellasing GPU 0
Starting job on GPU 0 with scene 37

OMP_NUM_THREADS=4 CUDA_VISIBLE_DEVICES=0 python train.py -s data/dtu_dataset/DTU/scan37 -m output/DTU_3DGS/scan37 -r 2
Traceback (most recent call last):
File "train.py", line 16, in <module>
from gaussian_renderer import render, network_gui
File "/hdd/jhee/3D/TrimGS/Trim3DGS/gaussian_renderer/__init__.py", line 14, in <module>
from diff_gaussian_rasterization import GaussianRasterizationSettings, GaussianRasterizer
File "/home/jhee/miniconda3/envs/trimgs/lib/python3.8/site-packages/diff_gaussian_rasterization/__init__.py", line 15, in <module>
from . import _C
ImportError: /home/jhee/miniconda3/envs/trimgs/lib/python3.8/site-packages/torch/lib/libtorch_cuda_cpp.so: undefined symbol: _ZTIN4c10d12ProcessGroup4WorkE
Job (0, (37, 2)) has finished., rellasing GPU 0
^CTraceback (most recent call last):
File "scripts/run_dtu.py", line 81, in <module>
dispatch_jobs(jobs, executor)
File "scripts/run_dtu.py", line 74, in dispatch_jobs
time.sleep(5)
KeyboardInterrupt`
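
Because run_dtu.py keeps dispatching jobs after each training process crashes on import, it can be quicker to test the failing imports directly before launching any job. The following is a minimal sketch, not part of TrimGS: the helper names probe and report are hypothetical, and the module list is taken only from the tracebacks above.

```python
import importlib

# Modules that appear in the tracebacks above; extend the list for your setup.
MODULES = ["torch", "pytorch3d.ops", "diff_gaussian_rasterization"]


def probe(name):
    """Try to import `name`; return (ok, message) instead of raising."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # ImportError, OSError from a missing .so, etc.
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(getattr(module, "__version__", "no __version__ attribute"))


def report(names=MODULES):
    """Probe every module, print one line per module, and return the results."""
    results = {name: probe(name) for name in names}
    for name, (ok, message) in results.items():
        status = "OK  " if ok else "FAIL"
        print(f"{status} {name}: {message}")
    return results


if __name__ == "__main__":
    report()
```

Running this inside the trimgs environment should surface the same ImportError messages in a few seconds, without starting a training job for every scene.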

(trimgs) (base) jhee@vsclab04:/hdd/jhee/3D/TrimGS$ conda list
# packages in environment at /home/jhee/miniconda3/envs/trimgs:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
blas                      1.0                         mkl  
brotli-python             1.0.9            py38h6a678d5_8  
bzip2                     1.0.8                h5eee18b_6  
ca-certificates           2024.3.11            h06a4308_0  
certifi                   2024.6.2         py38h06a4308_0  
charset-normalizer        3.3.2                    pypi_0    pypi
cudatoolkit               10.1.243             h6bb024c_0  
diff-gaussian-rasterization 0.0.0                    pypi_0    pypi
ffmpeg                    4.3                  hf484d3e_0    pytorch
filelock                  3.15.4                   pypi_0    pypi
freetype                  2.12.1               h4a9f257_0  
fsspec                    2024.6.1                 pypi_0    pypi
fvcore                    0.1.5.post20221221          pypi_0    pypi
gmp                       6.2.1                h295c915_3  
gmpy2                     2.1.2            py38heeb90bb_0  
gnutls                    3.6.15               he1e5248_0  
idna                      3.7              py38h06a4308_0  
intel-openmp              2023.1.0         hdb19cb5_46306  
iopath                    0.1.10                   pypi_0    pypi
jinja2                    3.1.4            py38h06a4308_0  
jpeg                      9e                   h5eee18b_1  
lame                      3.100                h7b6447c_0  
lcms2                     2.12                 h3be6417_0  
ld_impl_linux-64          2.38                 h1181459_1  
lerc                      3.0                  h295c915_0  
libdeflate                1.17                 h5eee18b_1  
libffi                    3.4.4                h6a678d5_1  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libiconv                  1.16                 h5eee18b_3  
libidn2                   2.3.4                h5eee18b_0  
libjpeg-turbo             2.0.0                h9bf148f_0    pytorch
libpng                    1.6.39               h5eee18b_0  
libstdcxx-ng              11.2.0               h1234567_1  
libtasn1                  4.19.0               h5eee18b_0  
libtiff                   4.5.1                h6a678d5_0  
libunistring              0.9.10               h27cfd23_0  
libwebp-base              1.3.2                h5eee18b_0  
llvm-openmp               14.0.6               h9e868ea_0  
lz4-c                     1.9.4                h6a678d5_1  
markupsafe                2.1.5                    pypi_0    pypi
mkl                       2023.1.0         h213fc3f_46344  
mkl-service               2.4.0            py38h5eee18b_1  
mkl_fft                   1.3.8            py38h5eee18b_0  
mkl_random                1.2.4            py38hdb19cb5_0  
mpc                       1.1.0                h10f8cd9_1  
mpfr                      4.0.2                hb69a4c5_1  
mpmath                    1.3.0            py38h06a4308_0  
ncurses                   6.4                  h6a678d5_0  
nettle                    3.7.3                hbbd107a_1  
networkx                  3.1              py38h06a4308_0  
ninja                     1.11.1.1                 pypi_0    pypi
numpy                     1.24.4                   pypi_0    pypi
numpy-base                1.24.3           py38h060ed82_1  
nvidia-cublas-cu12        12.1.3.1                 pypi_0    pypi
nvidia-cuda-cupti-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-nvrtc-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-runtime-cu12  12.1.105                 pypi_0    pypi
nvidia-cudnn-cu12         8.9.2.26                 pypi_0    pypi
nvidia-cufft-cu12         11.0.2.54                pypi_0    pypi
nvidia-curand-cu12        10.3.2.106               pypi_0    pypi
nvidia-cusolver-cu12      11.4.5.107               pypi_0    pypi
nvidia-cusparse-cu12      12.1.0.106               pypi_0    pypi
nvidia-nccl-cu12          2.20.5                   pypi_0    pypi
nvidia-nvjitlink-cu12     12.5.40                  pypi_0    pypi
nvidia-nvtx-cu12          12.1.105                 pypi_0    pypi
opencv-python-headless    4.10.0.84                pypi_0    pypi
openh264                  2.1.1                h4ff587b_0  
openjpeg                  2.4.0                h3ad879b_0  
openssl                   3.0.14               h5eee18b_0  
pillow                    10.3.0           py38h5eee18b_0  
pip                       24.0             py38h06a4308_0  
plyfile                   1.0.3                    pypi_0    pypi
portalocker               2.10.0                   pypi_0    pypi
pysocks                   1.7.1            py38h06a4308_0  
python                    3.8.19               h955ad1f_0  
pytorch                   2.3.1               py3.8_cpu_0    pytorch
pytorch-mutex             1.0                         cpu    pytorch
pytorch3d                 0.3.0                    pypi_0    pypi
pyyaml                    6.0.1            py38h5eee18b_0  
readline                  8.2                  h5eee18b_0  
requests                  2.32.3                   pypi_0    pypi
setuptools                69.5.1           py38h06a4308_0  
simple-knn                0.0.0                    pypi_0    pypi
sqlite                    3.45.3               h5eee18b_0  
sympy                     1.12.1                   pypi_0    pypi
tabulate                  0.9.0                    pypi_0    pypi
tbb                       2021.8.0             hdb19cb5_0  
termcolor                 2.4.0                    pypi_0    pypi
tk                        8.6.14               h39e8969_0  
torch                     1.12.1+cu113             pypi_0    pypi
torchaudio                0.12.1+cu113             pypi_0    pypi
torchvision               0.13.1+cu113             pypi_0    pypi
tqdm                      4.66.4                   pypi_0    pypi
triton                    2.3.1                    pypi_0    pypi
typing-extensions         4.12.2                   pypi_0    pypi
typing_extensions         4.11.0           py38h06a4308_0  
urllib3                   2.2.2            py38h06a4308_0  
wheel                     0.43.0           py38h06a4308_0  
xz                        5.4.6                h5eee18b_1  
yacs                      0.1.8                    pypi_0    pypi
yaml                      0.2.5                h7b6447c_0  
zlib                      1.2.13               h5eee18b_1  
zstd                      1.5.5                hc292b87_2  

YuxueYang1204 commented 3 months ago

It looks like you have installed the CPU build of PyTorch. The error you're encountering, libcudart.so.10.1: cannot open shared object file: No such file or directory, is most likely related to how PyTorch3D was installed. I recommend checking the PyTorch3D documentation or their support channels for a solution. In addition, here is part of my environment, which I hope is helpful:

pytorch                   2.1.0           py3.8_cuda12.1_cudnn8.9.2_0    pytorch
pytorch-cuda              12.1                 ha16c6d3_5    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pytorch-scatter           2.1.2           py38_torch_2.1.0_cu121    pyg
pytorch3d                 0.7.5           py38_cu121_pyt210    pytorch3d
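
The kind of mismatch discussed here can also be spotted mechanically from `conda list` output. The sketch below is a rough heuristic rather than an official conda or PyTorch tool; the function names parse_conda_list, cuda_majors and find_conflicts are hypothetical, and the sample rows are copied from the listing posted earlier in this thread. It flags three things: PyTorch installed under both its conda name (pytorch) and its pip name (torch), a conda PyTorch build marked as CPU-only, and packages that point at more than one CUDA major version.

```python
import re

# A few rows copied from the `conda list` output posted in this thread.
SAMPLE = """\
cudatoolkit               10.1.243             h6bb024c_0
nvidia-cuda-runtime-cu12  12.1.105                 pypi_0    pypi
pytorch                   2.3.1               py3.8_cpu_0    pytorch
pytorch3d                 0.3.0                    pypi_0    pypi
torch                     1.12.1+cu113             pypi_0    pypi
"""


def parse_conda_list(text):
    """Parse `conda list` output into {name: (version, build, channel)}."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) < 2:
            continue
        build = fields[2] if len(fields) > 2 else ""
        channel = fields[3] if len(fields) > 3 else ""
        packages[fields[0]] = (fields[1], build, channel)
    return packages


def cuda_majors(packages):
    """Collect the CUDA major versions hinted at by package names and versions."""
    majors = set()
    for name, (version, _build, _channel) in packages.items():
        if name == "cudatoolkit":
            majors.add(int(version.split(".")[0]))
        # Assumes two-digit majors: '+cu113' -> 11, '+cu121' -> 12.
        match = re.search(r"\+cu(\d+)$", version)
        if match:
            majors.add(int(match.group(1)[:2]))
        # pip wheels such as 'nvidia-cublas-cu12' carry the major in the name.
        match = re.search(r"-cu(\d+)$", name)
        if match:
            majors.add(int(match.group(1)[:2]))
    return majors


def find_conflicts(packages):
    """Return human-readable warnings for the three heuristics described above."""
    warnings = []
    if "pytorch" in packages and "torch" in packages:
        warnings.append(
            "PyTorch is installed twice: conda 'pytorch' %s and pip 'torch' %s"
            % (packages["pytorch"][0], packages["torch"][0])
        )
    if "pytorch" in packages and "cpu" in packages["pytorch"][1]:
        warnings.append("conda 'pytorch' is a CPU-only build: %s" % packages["pytorch"][1])
    majors = sorted(cuda_majors(packages))
    if len(majors) > 1:
        warnings.append("packages reference several CUDA major versions: %s" % majors)
    return warnings


if __name__ == "__main__":
    for warning in find_conflicts(parse_conda_list(SAMPLE)):
        print("WARNING:", warning)
```

To check a real environment, pass the text of `conda list` to parse_conda_list instead of SAMPLE. A clean environment, such as the one listed just above with a single pytorch package built for CUDA 12.1, should produce no warnings from these checks.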

Abyssaledge commented 2 months ago

Feel free to reopen this issue if you have further questions.