xxlong0 / SparseNeuS

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse views

RuntimeError: CUDA error: invalid configuration argument. #13

Open · softword-tt opened this issue 1 year ago

softword-tt commented 1 year ago

Hi, I want to render with my own data without training. I followed NeuS/preprocess_custom_data, using COLMAP to prepare my own test data; the results are in the same format as sample_data/scan_114. My command is:

python exp_runner_finetune.py \
  --mode val --conf ./confs/finetune.conf --is_finetune \
  --checkpoint_path ./weights/ckpt.pth \
  --case_name scan114 --train_imgs_idx 0 1 2 --test_imgs_idx 0 1 2 --near 200 --far 800 \
  --visibility_beta 0.010 --visibility_gama 0.010 --visibility_weight_thred 0.7

With no changes other than the case_name, scan114 validates successfully, but my own data raises the error RuntimeError: CUDA error: invalid configuration argument. The output:

detected 8 GPUs
base_exp_dir: ./exp/dtu/finetune/DTU/seen_imgs_0_1_2/2022_08_31_12_46_19
[exp_runner_finetune.py:177 - __init__() ] Find checkpoint: ./weights/ckpt.pth
sdf_network_lod1 load fails
[exp_runner_finetune.py:483 - load_checkpoint() ] End
/DATA/disk1/epic/yanzhu/miniconda3/envs/nrvgn/lib/python3.9/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
type ncc
patch_size 3
beta 0.01 gama 0.01 weight_thred [0.7]
[dtu_fit.py:45 - __init__() ] Load data: Begin
[dtu_fit.py:107 - __init__() ] Load data: End
[dtu_fit.py:45 - __init__() ] Load data: Begin
[dtu_fit.py:107 - __init__() ] Load data: End
/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/data/dtu_fit.py:238: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  sample['partial_vol_origin'] = torch.tensor(self.partial_vol_origin, dtype=torch.float32)
(the same UserWarning is emitted three more times from dtu_fit.py:238 and once from dtu_fit.py:209)
Traceback (most recent call last):
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/exp_runner_finetune.py", line 596, in <module>
    runner = Runner(args.conf, args.mode, args.is_continue,
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/exp_runner_finetune.py", line 202, in __init__
    self.initialize_network()
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/exp_runner_finetune.py", line 283, in initialize_network
    self.trainer.initialize_finetune_network(sample, train_from_scratch=self.train_from_scratch)
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/models/trainer_finetune.py", line 231, in initialize_finetune_network
    con_volume, con_mask_volume, _ = self.prepare_con_volume(sample)
  File "/DATA/disk1/epic/yanzhu/miniconda3/envs/nrvgn/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/models/trainer_finetune.py", line 176, in prepare_con_volume
    conditional_features_lod0 = self.sdf_network_lod0.get_conditional_volume(
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/models/sparse_sdf_network.py", line 364, in get_conditional_volume
    feat = self.sparse_costreg_net(sparse_feat)
  File "/DATA/disk1/epic/yanzhu/miniconda3/envs/nrvgn/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/tsparse/modules.py", line 293, in forward
    conv0 = self.conv0(x)
  File "/DATA/disk1/epic/yanzhu/miniconda3/envs/nrvgn/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/DATA/disk1/epic/ciyuruan/SparseNeuS/SparseNeuS/tsparse/modules.py", line 107, in forward
    out = self.net(x)
  File "/DATA/disk1/epic/yanzhu/miniconda3/envs/nrvgn/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/DATA/disk1/epic/yanzhu/miniconda3/envs/nrvgn/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/DATA/disk1/epic/yanzhu/miniconda3/envs/nrvgn/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ciyuruan/.local/lib/python3.9/site-packages/torchsparse/nn/modules/conv.py", line 66, in forward
    return F.conv3d(input,
  File "/home/ciyuruan/.local/lib/python3.9/site-packages/torchsparse/nn/functional/conv.py", line 114, in conv3d
    results = F.sphashquery(queries, references)
  File "/home/ciyuruan/.local/lib/python3.9/site-packages/torchsparse/nn/functional/query.py", line 21, in sphashquery
    output = torchsparse.backend.hash_query_cuda(queries, references,
RuntimeError: CUDA error: invalid configuration argument

The correct output with scan114: Screenshot from 2022-08-31 12-49-04. I also printed world_mat_0 to check whether there is a transposition: Screenshot from 2022-08-31 12-32-59. The left is my test data (seen), the right is the example (scan114).
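One quick sanity check worth doing before suspecting the code: --near 200 --far 800 are DTU-scale bounds, so they should also bracket the camera-to-object distance of the custom COLMAP reconstruction. Below is a rough sketch of such a check, assuming the NeuS-style cameras_sphere.npz layout with world_mat_i (= K @ [R|t]) and scale_mat_i keys; the file path and view indices are only placeholders for the custom scan.

```python
# Sanity check for custom data prepared with NeuS preprocess_custom_data.
# Assumptions: cameras are stored in cameras_sphere.npz with world_mat_{i}
# (= K @ [R|t]) and scale_mat_{i} keys; the path below is hypothetical.
import numpy as np

cams = np.load('./sample_data/my_scan/cameras_sphere.npz')

for i in (0, 1, 2):                              # views from --train_imgs_idx
    P = cams[f'world_mat_{i}'][:3, :4]           # projection matrix K @ [R|t]
    centre = -np.linalg.inv(P[:, :3]) @ P[:, 3]  # camera centre in world units
    obj = cams[f'scale_mat_{i}'][:3, 3]          # rough object centre in world units
    dist = float(np.linalg.norm(centre - obj))
    print(f'view {i}: camera-to-object distance = {dist:.1f}')
```

If those distances fall well outside [200, 800], the rays mostly sample empty space and the sparse feature volume can end up with few or no valid voxels, which (as far as I understand) can surface as exactly this kind of CUDA launch error inside torchsparse.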

jcliu0428 commented 1 year ago

Hi, I met the same issue when trying to train this framework on ScanNet. Have you solved it? Thanks.

flamehaze1115 commented 1 year ago


The errors indicate that the sparse convolution library (torchsparse) is not correctly installed.
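If the installation is the suspect, a minimal smoke test of torchsparse on the GPU, independent of the SparseNeuS data pipeline, can help isolate it. Here is a sketch, assuming the torchsparse 1.x-style API (SparseTensor(feats, coords) with int32 (x, y, z, batch) coordinates) that the tsparse modules in this repo appear to use; shapes and sizes are arbitrary.

```python
# Minimal torchsparse smoke test, independent of SparseNeuS.
# Assumes the torchsparse 1.x API: SparseTensor(feats, coords) with int32
# coordinates ordered as (x, y, z, batch).
import torch
import torchsparse.nn as spnn
from torchsparse import SparseTensor

# Random, de-duplicated voxel coordinates in a 32^3 grid, single batch.
xyz = torch.unique(torch.randint(0, 32, (1000, 3), dtype=torch.int32), dim=0)
coords = torch.cat([xyz, torch.zeros(len(xyz), 1, dtype=torch.int32)], dim=1)
feats = torch.rand(len(xyz), 8)

x = SparseTensor(feats.cuda(), coords.cuda())
net = spnn.Conv3d(8, 16, kernel_size=3).cuda()
out = net(x)
print('sparse conv ok, output feature shape:', out.F.shape)
```

If this tiny example already fails with the same "invalid configuration argument", the torchsparse build does not match the installed CUDA/PyTorch and should be rebuilt against the current environment; if it passes, the problem is more likely in the data (for example, an empty sparse volume caused by a near/far range or camera scale that does not match the custom scene).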