gwyllo opened this issue 1 month ago
I think you have to change GPU_ID (the third line of the bash script) to 0, since you only have one GPU.
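If the script passes that index to the Python stages via CUDA_VISIBLE_DEVICES (an assumption about how run_train_infer.sh uses GPU_ID, not something shown in this thread), a minimal sketch of how a non-existent index produces exactly this error:

```python
import os

# Assumption: run_train_infer.sh exposes GPU_ID to the Python stages via
# CUDA_VISIBLE_DEVICES. On a single-GPU machine only index 0 exists, so
# selecting index 1 (or higher) hides every device from PyTorch.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # hypothetical wrong GPU_ID

import torch

print(torch.cuda.is_available())  # False: no visible CUDA devices
# Any subsequent .to("cuda") then raises
# "RuntimeError: No CUDA GPUs are available".
```

With GPU_ID set to 0 (i.e. CUDA_VISIBLE_DEVICES="0"), the same check should print True, matching the behaviour of the standalone torchCheck.py run below.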
I can successfully install all dependencies following the instructions in the repo. I have also tried the same using the Docker image.
I get the following error when trying to run the run_train_infer.sh script, using either the from-scratch install or the Docker image:
```
(instantsplat) root@C.13193616:/InstantSplat$ bash scripts/run_train_infer.sh
========= santorini: Dust3r_coarse_geometric_initialization =========
... loading model from submodules/dust3r/checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth
instantiating : AsymmetricCroCo3DStereo(enc_depth=24, dec_depth=12, enc_embed_dim=1024, dec_embed_dim=768, enc_num_heads=16, dec_num_heads=12, pos_embed='RoPE100', patch_embed_cls='PatchEmbedDust3R', img_size=(512, 512), head_type='dpt', output_mode='pts3d', depth_mode=('exp', -inf, inf), conf_mode=('exp', 1, inf), landscape_only=False)
<All keys matched successfully>
Traceback (most recent call last):
  File "/InstantSplat/./coarse_init_infer.py", line 53, in <module>
    model = AsymmetricCroCo3DStereo.from_pretrained(model_path).to(device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 927, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1326, in convert
    return t.to(
           ^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

========= santorini: Train: jointly optimize pose =========
Optimizing ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Output folder: ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Traceback (most recent call last):
  File "/InstantSplat/./train_joint.py", line 279, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from, args)
  File "/InstantSplat/./train_joint.py", line 60, in training
    scene = Scene(dataset, gaussians, opt=args, shuffle=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/InstantSplat/scene/__init__.py", line 49, in __init__
    assert False, "Could not recognize scene type!"
           ^^^^^
AssertionError: Could not recognize scene type!

========= santorini: Render interpolated pose & output video =========
Looking for config file in ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/cfg_args
Config file found: ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/cfg_args
Rendering ./output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/
Traceback (most recent call last):
  File "/InstantSplat/./render_by_interp.py", line 143, in <module>
    render_sets(
  File "/InstantSplat/./render_by_interp.py", line 98, in render_sets
    save_interpolate_pose(dataset.model_path, iteration, args.n_views)
  File "/InstantSplat/./render_by_interp.py", line 33, in save_interpolate_pose
    org_pose = np.load(model_path + f"pose/pose_{iter}.npy")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/instantsplat/lib/python3.11/site-packages/numpy/lib/_npyio_impl.py", line 455, in load
    fid = stack.enter_context(open(os.fspath(file), "rb"))
          ^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './output/infer/sora/santorini/3_views_1000Iter_1xPoseLR/pose/pose_1000.npy'
```
nvcc --version suggests CUDA is installed correctly:
```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
```
and a simple script to check whether CUDA is accessible to PyTorch also works as expected within this conda environment:
```python
import torch

print(torch.cuda.is_available())
print(torch.version.cuda)
print(torch.cuda.device_count())
```
```
(instantsplat) root@C.13193616:/$ python torchCheck.py
True
12.1
1
```
Any ideas on the underlying cause of this issue?
Hi, have you solved this issue? I have run into the same problem.