VITA-Group / FSGS

[ECCV 2024]"FSGS: Real-Time Few-Shot View Synthesis using Gaussian Splatting", Zehao Zhu*, Zhiwen Fan*, Yifan Jiang, Zhangyang Wang

"visibility_filter": radii > 0, RuntimeError: CUDA error: an illegal memory access was encountered #37

Closed chenkangjie1123 closed 9 months ago

chenkangjie1123 commented 9 months ago

(FSGS) 23ckj@amax:/mnt1/ckj/FSGS/FSGS-main$ CUDA_VISIBLE_DEVICES=7 CUDA_LAUNCH_BLOCKING=1 python train.py --source_path /mnt1/ckj/gaussian-splatting/tandt_db/nerf_llff_data/horns --model_path output/horns --eval --n_views 3 --sample_pseudo_interval 1
/mnt1/ckj/miniconda/envs/FSGS/lib/python3.7/site-packages/timm/models/_factory.py:121: UserWarning: Mapping deprecated model name vit_base_resnet50_384 to current vit_base_r50_s16_384.orig_in21k_ft_in1k.
  **kwargs,
[1000, 2000, 3000, 5000, 10000]
Optimizing output/horns
Output folder: output/horns [04/03 21:00:20]
2024-03-04 21:00:20.653428: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-04 21:00:20.856054: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-03-04 21:00:21.587124: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /mnt1/ckj/miniconda/envs/FSGS/lib/python3.7/site-packages/cv2/../../lib64:
2024-03-04 21:00:21.587234: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /mnt1/ckj/miniconda/envs/FSGS/lib/python3.7/site-packages/cv2/../../lib64:
2024-03-04 21:00:21.587246: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Reading camera 62/62 [04/03 21:00:22]
6.323975610733033 cameras_extent [04/03 21:00:22]
Loading Training Cameras [04/03 21:00:22]
3it [00:03, 1.28s/it]
Loading Test Cameras [04/03 21:00:26]
8it [00:00, 12.25it/s]
Number of points at initialisation : 19397 [04/03 21:00:34]
Training progress: 1%|█ | 100/10000 [00:02<04:35, 35.97it/s, Loss=0.3308100]
Traceback (most recent call last):
  File "train.py", line 280, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args)
  File "train.py", line 90, in training
    render_pkg = render(viewpoint_cam, gaussians, pipe, background)
  File "/mnt1/ckj/FSGS/FSGS-main/gaussian_renderer/__init__.py", line 131, in render
    "visibility_filter": radii > 0,
RuntimeError: CUDA error: an illegal memory access was encountered
Training progress: 1%|█ | 100/10000 [00:03<05:10, 31.90it/s, Loss=0.3308100]

I have tried several fixes, but none of them worked. In particular, the changes suggested in https://github.com/graphdeco-inria/diff-gaussian-rasterization/pull/10 and https://github.com/graphdeco-inria/gaussian-splatting/issues/41#issuecomment-1784246821 do not resolve the error for me.

chenkangjie1123 commented 9 months ago

My CUDA version is 11.8.

chenkangjie1123 commented 9 months ago

Also, the original gaussian-splatting source code runs without this issue.

chenkangjie1123 commented 9 months ago

OK, I solved it. Previously I had been reusing my existing gaussian-splatting environment and only swapping in the diff rasterization package on top of it. Today I set up the environment again from scratch and changed the Python version from 3.7 to 3.8, and then it worked.
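
For anyone comparing a reused environment against a freshly built one, a minimal sketch of a version check might look like the following. It is not part of FSGS. The function name `environment_report` is hypothetical, and the module names `diff_gaussian_rasterization` and `simple_knn` are assumptions about what the compiled extensions are called once installed, so adjust them to match your setup. It only reports what is installed and does not fix the illegal memory access.

```python
import importlib.util
import sys


def environment_report(extension_modules=("diff_gaussian_rasterization", "simple_knn")):
    """Collect the versions that matter when a compiled CUDA extension misbehaves.

    The default module names are assumptions, not confirmed by the FSGS repo.
    """
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}

    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        report["torch"] = None
        report["torch_cuda"] = None
    else:
        report["torch"] = torch.__version__
        # CUDA version PyTorch was built against (None for CPU-only builds).
        report["torch_cuda"] = torch.version.cuda

    # Check whether each extension is importable without actually loading it.
    for name in extension_modules:
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this in both the old and the new environment and comparing the output shows whether the Python version, the PyTorch build, or a missing extension differs between them.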

2693748650 commented 4 months ago

Hi! Could you please explain in detail how you solved this problem? I also set up the environment again, but I still get the same error and cannot figure out how to fix it. Thanks