NVIDIAGameWorks / kaolin

A PyTorch Library for Accelerating 3D Deep Learning Research
Apache License 2.0

question about setup error #673

Open gushengbo opened 1 year ago

gushengbo commented 1 year ago

kaolin/csrc/ops/conversions/mesh_to_spc/mesh_to_spc_cuda.cu:21:10: fatal error: cub/device/device_scan.cuh: No such file or directory

 #include <cub/device/device_scan.cuh>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

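
For readers hitting the same error: a missing `cub/device/device_scan.cuh` usually means the compiler cannot find the CUB headers on its include path. CUB is bundled with the CUDA Toolkit from version 11.0 onward; with older toolkits it has to be obtained separately. The sketch below is a hypothetical helper (not part of kaolin) that looks for the header in candidate include directories. Reading the locations from the `CUDA_HOME` and `CUB_HOME` environment variables is an assumption about how the build is configured, so adjust the paths to your setup.

```python
import os
from pathlib import Path

# Header named in the compiler error, relative to an include directory.
HEADER = Path("cub") / "device" / "device_scan.cuh"


def candidate_include_dirs(env=None):
    """Build a list of directories that might contain the CUB headers.

    Assumption: CUDA_HOME points at the toolkit root and CUB_HOME (if set)
    points at a separately downloaded CUB checkout.
    """
    env = os.environ if env is None else env
    dirs = []
    if env.get("CUDA_HOME"):
        dirs.append(Path(env["CUDA_HOME"]) / "include")
    if env.get("CUB_HOME"):
        dirs.append(Path(env["CUB_HOME"]))
    return dirs


def find_cub_header(include_dirs):
    """Return the first existing path to device_scan.cuh, or None."""
    for directory in include_dirs:
        candidate = Path(directory) / HEADER
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_cub_header(candidate_include_dirs())
    print(found if found else "device_scan.cuh not found in the checked directories")
```

If the helper reports nothing, the header is not where these two variables point, which would be consistent with the `fatal error` above.
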
Caenorst commented 1 year ago

Hi @gushengbo, please provide more information about your system and configuration, along with the full installation logs.
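
One way to gather the requested system details in a single step is a small script like the sketch below. It is a hypothetical helper, not something kaolin provides: it records the Python version and platform, the output of `nvcc --version` and `gcc --version` when those tools are on `PATH`, and the PyTorch version fields when `torch` can be imported.

```python
import platform
import shutil
import subprocess
import sys


def run(cmd):
    """Return the stdout of cmd, or None if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def collect_report():
    """Collect basic facts about the build environment into a dict."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "nvcc": run(["nvcc", "--version"]),
        "gcc": run(["gcc", "--version"]),
    }
    try:
        import torch  # optional: only reported when installed
        report["torch"] = torch.__version__
        report["torch_cuda"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        report["torch"] = None
    return report


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

Running it on the same node where kaolin is built shows whether a GPU is visible to PyTorch during the build.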

gushengbo commented 1 year ago

Hi, following up on my earlier question: I have now successfully installed kaolin==0.12.0 with ninja, but I get a different error when I run the code.

(icon) shengbo@gaia:~/ICON-master$ python -m apps.train -cfg ./configs/train/icon-filter.yaml -test
PyMeshLab 0.1.7 based on MeshLab 2020.12d
mesh............. True
ICON: w/ Global Image Encoder: True
Image Features used by MLP: ['normal_F', 'normal_B']
Geometry Features used by MLP: ['sdf', 'cmap', 'norm', 'vis']
Dim of Image Features (local): 6
Dim of Geometry Features (ICON): 7
Dim of MLP's first layer: 13

GPU available: True, used: True
TPU available: None, using: 0 TPU cores
Resume MLP weights from ./data/ckpt/icon-filter.ckpt
Resume normal model from ./data/ckpt/normal.ckpt
load from ./data/cape/test.txt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]
cuda::::::: True
Testing: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/shengbo/ICON-master/apps/train.py", line 144, in <module>
    trainer.test(model=model, datamodule=datamodule)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 915, in test
    results = self.test_given_model(model, test_dataloaders)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 973, in test_given_model
    results = self.fit(model)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 499, in fit
    self.dispatch()
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 540, in dispatch
    self.accelerator.start_testing(self)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 76, in start_testing
    self.training_type_plugin.start_testing(trainer)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 118, in start_testing
    self._results = trainer.run_test()
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 786, in run_test
    eval_loopresults, = self.run_evaluation()
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 725, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 160, in evaluation_step
    output = self.trainer.accelerator.test_step(args)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 195, in test_step
    return self.training_type_plugin.test_step(args)
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 134, in test_step
    return self.lightning_module.test_step(args, *kwargs)
  File "/home/shengbo/ICON-master/apps/ICON.py", line 572, in test_step
    sdf = self.reconEngine(opt=self.cfg,
  File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(input, kwargs)
  File "/home/shengbo/ICON-master/lib/common/seg3d_lossless.py", line 147, in forward
    return self._forward_faster(kwargs)
  File "/home/shengbo/ICON-master/lib/common/seg3d_lossless.py", line 169, in _forward_faster
    occupancys = self.batch_eval(coords, kwargs)
  File "/home/shengbo/ICON-master/lib/common/seg3d_lossless.py", line 138, in batch_eval
    occupancys = self.query_func(kwargs, points=coords2D)
  File "/home/shengbo/ICON-master/lib/common/train_util.py", line 434, in query_func
    preds = netG.query(features=features,
  File "/home/shengbo/ICON-master/lib/net/HGPIFuNet.py", line 307, in query
    point_feat_out = point_feat_extractor.query(
  File "/home/shengbo/ICON-master/lib/dataset/PointFeat.py", line 44, in query
    residues, ptsind, = point_to_mesh_distance(points, self.triangles)
  File "/home/shengbo/ICON-master/kaolin/kaolin/metrics/trianglemesh.py", line 81, in point_to_mesh_distance
    cur_dist, cur_face_idx, cur_dist_type = _UnbatchedTriangleDistanceCuda.apply(
  File "/home/shengbo/ICON-master/kaolin/kaolin/metrics/trianglemesh.py", line 125, in forward
    _C.metrics.unbatched_triangle_distance_forward_cuda(
RuntimeError: unbatched_triangle_distance not built with CUDA

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

kaolin 0.12.0 /home/shengbo/ICON-master/kaolin
torch 1.12.1+cu102
torchaudio 0.12.1+cu102
torchmetrics 0.11.0
torchvision 0.13.1+cu102

I am running this on a cluster.
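
A quick sanity check on the versions posted above is whether the CUDA release reported by `nvcc` matches the CUDA build tag in the PyTorch wheel version. The sketch below is a hypothetical helper, not a kaolin utility: it parses the `release X.Y` field from `nvcc --version` output and the `+cuXYZ` suffix of a PyTorch version string. It assumes the last digit of the suffix is the minor version, which holds for tags such as `cu102` and `cu113`.

```python
import re


def nvcc_cuda_version(nvcc_output):
    """Extract (major, minor) from 'nvcc --version' output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def torch_cuda_version(torch_version):
    """Extract (major, minor) from a '+cuXYZ' wheel suffix, or None.

    Assumption: the final digit is the minor version (cu102 -> 10.2).
    """
    match = re.search(r"\+cu(\d{2,})$", torch_version)
    if not match:
        return None
    digits = match.group(1)
    return (int(digits[:-1]), int(digits[-1]))


def versions_match(nvcc_output, torch_version):
    """True only when both versions parse and are equal."""
    toolkit = nvcc_cuda_version(nvcc_output)
    wheel = torch_cuda_version(torch_version)
    return toolkit is not None and toolkit == wheel


if __name__ == "__main__":
    nvcc_text = "Cuda compilation tools, release 10.2, V10.2.89"
    print(versions_match(nvcc_text, "1.12.1+cu102"))
```

For the values in this thread both sides parse to `(10, 2)`, so by this check a toolkit mismatch does not appear to explain the `not built with CUDA` error.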

Caenorst commented 1 year ago

Hi @gushengbo, I don't think kaolin was installed properly. Can you please provide the installation logs?