IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
https://arxiv.org/abs/2401.14159
Apache License 2.0

run grounded_sam_3d_box.ipynb AssertionError: Unexpected error, library: 'cumm' wasn't imported properly. #440

Open keeper-jie opened 5 months ago

keeper-jie commented 5 months ago
1. Build the project as follows:

       make build-image
       make run

2. Connect to the server over SSH from VS Code and run `grounded_sam_3d_box.ipynb`. Two modules were missing, so I installed them from the notebook:

       # ModuleNotFoundError: No module named 'spconv'
       !pip install spconv

       # ModuleNotFoundError: No module named 'easydict'
       !pip install easydict

3. Download the following files and put them into the `data` directory:
   - `sam_vit_b_01ec64.pth` from https://huggingface.co/spaces/jbrinkma/segment-anything/blob/main/sam_vit_b_01ec64.pth
   - `voxelnext_nuscenes_kernel1.pth` from https://drive.google.com/file/d/17mQRXXUsaD0dlRzAKep3MQjfj8ugDsp9/view
   - `points_demo.npy` from https://drive.google.com/file/d/1br0VDamameu7B1G1p4HEjj6LshGs5dHB/view

4. Modify the code to load the point cloud from that directory:

       point_dict = {"points": np.load("./data/points_demo.npy")}

5. The error occurs when running `mask, box3d = model(image, point_dict, prompt_point, lidar2img_rt, image_id)`:

    AssertionError                            Traceback (most recent call last)
    /home/appuser/working_dir/grounded_sam_3d_box.ipynb Cell 8 line 3
          1 image = cv2.imread(image_path)
          2 prompt_point = np.array([[560, 500]])
    ----> 3 mask, box3d = model(image, point_dict, prompt_point, lidar2img_rt, image_id)
          4 if not box3d is None:
          5     image = _draw_3dbox(box3d, lidar2img_rt, image, mask)

    File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
       1190 # If we don't have any hooks, we want to skip the rest of the logic in
       1191 # this function, and just call forward.
       1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
       1193         or _global_forward_hooks or _global_forward_pre_hooks):
    -> 1194     return forward_call(*input, **kwargs)
       1195 # Do not call functions when jit is used
       1196 full_backward_hooks, non_full_backward_hooks = [], []

    File /home/appuser/working_dir/voxelnext_3d_box/model.py:120, in Model.forward(self, image, point_dict, prompt_point, lidar2img_rt, image_id, quality_score)
        118 def forward(self, image, point_dict, prompt_point, lidar2img_rt, image_id, quality_score=0.1):
        119     self.image_embedding(image)
    --> 120     pred_dicts, voxel_coords = self.point_embedding(point_dict, image_id)
        122     masks, scores, _ = self.sam_predictor.predict(point_coords=prompt_point, point_labels=np.array([1]))
        123     mask = masks[0]

    File /home/appuser/working_dir/voxelnext_3d_box/model.py:60, in Model.point_embedding(self, data_dict, image_id)
         59 def point_embedding(self, data_dict, image_id):
    ---> 60     data_dict = self.voxelnext.data_processor.forward(
         61         data_dict=data_dict
         62     )
         63     data_dict['voxels'] = torch.Tensor(data_dict['voxels']).to(self.device)
         64     data_dict['voxel_num_points'] = torch.Tensor(data_dict['voxel_num_points']).to(self.device)

    File /home/appuser/working_dir/voxelnext_3d_box/models/data_processor.py:209, in DataProcessor.forward(self, data_dict)
        197 """
        198 Args:
        199     data_dict:
       (...)
        205 Returns:
        206 """
        208 for cur_processor in self.data_processor_queue:
    --> 209     data_dict = cur_processor(data_dict=data_dict)
        211 return data_dict

    File /home/appuser/working_dir/voxelnext_3d_box/models/data_processor.py:148, in DataProcessor.transform_points_to_voxels(self, data_dict, config)
        139     self.voxel_generator = VoxelGeneratorWrapper(
        140         vsize_xyz=config.VOXEL_SIZE,
        141         coors_range_xyz=self.point_cloud_range,
       (...)
        144         max_num_voxels=config.MAX_NUMBER_OF_VOXELS[self.mode],
        145     )
        147 points = data_dict['points']
    --> 148 voxel_output = self.voxel_generator.generate(points)
        149 voxels, coordinates, num_points = voxel_output
        151 data_dict['voxels'] = voxels

    File /home/appuser/working_dir/voxelnext_3d_box/models/data_processor.py:56, in VoxelGeneratorWrapper.generate(self, points)
         54     voxels, coordinates, num_points = voxel_output
         55 else:
    ---> 56     assert tv is not None, f"Unexpected error, library: 'cumm' wasn't imported properly."
         57     voxel_output = self._voxel_generator.point_to_voxel(tv.from_numpy(points))
         58     tv_voxels, tv_coordinates, tv_num_points = voxel_output

    AssertionError: Unexpected error, library: 'cumm' wasn't imported properly.
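The assertion only reports that `tv` is `None`; it does not show why the import failed. The following is a minimal sketch for surfacing the underlying exception from a notebook cell. It assumes that `tv` refers to `cumm.tensorview` (the traceback does not show the import statement, so that module name is an assumption), and `try_import` is a hypothetical helper, not part of the project.

```python
import importlib
import traceback


def try_import(module_name):
    """Import a module by name; return (module, None) or (None, traceback text)."""
    try:
        return importlib.import_module(module_name), None
    except Exception:
        # Keep the full traceback so the real cause (missing package,
        # missing shared library, ...) stays visible.
        return None, traceback.format_exc()


if __name__ == "__main__":
    # Module names are assumptions based on the assertion message and the pip list.
    for name in ("cumm", "cumm.tensorview", "spconv"):
        module, error = try_import(name)
        if error is None:
            print(f"{name}: OK ({getattr(module, '__file__', 'built-in')})")
        else:
            print(f"{name}: FAILED\n{error}")
```

If one of these imports fails, the printed traceback is likely more informative than the assertion message and is worth including in the report.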


I searched the issues of https://github.com/dvlab-research/VoxelNeXt for `cumm` but found no solution. It looks like a `cumm` version problem; my `cumm` version is 0.4.11.
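Since the notebook kernel may use a different environment than the shell, it can help to query the versions from inside the notebook. This is a small sketch using the standard-library `importlib.metadata`; `installed_versions` is a hypothetical helper, and the distribution names are taken from the pip list in this report.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Names copied from the pip list in this report.
    found = installed_versions(["cumm", "spconv", "pccm", "torch"])
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
```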
## pip list

    Package                   Version     Editable project location
    ------------------------- ----------- --------------------------------------------------------
    accelerate                0.26.1
    addict                    2.4.0
    aiofiles                  23.2.1
    altair                    5.2.0
    annotated-types           0.6.0
    anyio                     4.2.0
    asttokens                 2.0.5
    astunparse                1.6.3
    attrs                     23.2.0
    backcall                  0.2.0
    beautifulsoup4            4.11.1
    brotlipy                  0.7.0
    ccimport                  0.4.2
    certifi                   2022.9.24
    cffi                      1.15.1
    chardet                   4.0.0
    charset-normalizer        2.0.4
    click                     8.1.7
    colorama                  0.4.6
    coloredlogs               15.0.1
    conda                     22.11.1
    conda-build               3.23.3
    conda-package-handling    1.9.0
    contourpy                 1.2.0
    cryptography              38.0.1
    cumm                      0.4.11
    cycler                    0.12.1
    debugpy                   1.8.0
    decorator                 5.1.1
    defusedxml                0.7.1
    diffusers                 0.15.1
    distro                    1.9.0
    dnspython                 2.2.1
    easydict                  1.11
    exceptiongroup            1.0.4
    executing                 0.8.3
    expecttest                0.1.4
    fastapi                   0.109.0
    ffmpy                     0.3.1
    filelock                  3.6.0
    fire                      0.5.0
    flatbuffers               23.5.26
    flit_core                 3.6.0
    fonttools                 4.47.2
    fsspec                    2023.12.2
    future                    0.18.2
    glob2                     0.7
    gradio                    4.16.0
    gradio_client             0.8.1
    groundingdino             0.1.0       /home/appuser/Grounded-Segment-Anything/GroundingDINO
    h11                       0.14.0
    httpcore                  1.0.2
    httpx                     0.26.0
    huggingface-hub           0.20.3
    humanfriendly             10.0
    hypothesis                6.61.0
    idna                      3.4
    importlib-metadata        7.0.1
    importlib-resources       6.1.1
    ipykernel                 6.16.2
    ipython                   8.7.0
    jedi                      0.18.1
    Jinja2                    2.11.3
    jsonschema                4.21.1
    jsonschema-specifications 2023.12.1
    jupyter_client            8.6.0
    jupyter_core              5.7.1
    kiwisolver                1.4.5
    lark                      1.1.9
    libarchive-c              2.9
    markdown-it-py            3.0.0
    MarkupSafe                2.0.1
    matplotlib                3.6.0
    matplotlib-inline         0.1.6
    mdurl                     0.1.2
    mkl-fft                   1.3.1
    mkl-random                1.2.2
    mkl-service               2.4.0
    mpmath                    1.2.1
    nest-asyncio              1.6.0
    ninja                     1.11.1.1
    numpy                     1.26.3
    onnx                      1.13.1
    onnxruntime               1.14.1
    openai                    1.10.0
    opencv-python             4.7.0.72
    opencv-python-headless    4.9.0.80
    orjson                    3.9.12
    packaging                 23.2
    pandas                    2.2.0
    parso                     0.8.3
    pccm                      0.4.11
    pexpect                   4.8.0
    pickleshare               0.7.5
    Pillow                    9.3.0
    pip                       22.3.1
    pkginfo                   1.8.3
    platformdirs              4.1.0
    pluggy                    1.0.0
    portalocker               2.8.2
    prompt-toolkit            3.0.20
    protobuf                  3.20.3
    psutil                    5.9.0
    ptyprocess                0.7.0
    pure-eval                 0.2.2
    pybind11                  2.11.1
    pycocotools               2.0.6
    pycosat                   0.6.4
    pycparser                 2.21
    pydantic                  2.5.3
    pydantic_core             2.14.6
    pydub                     0.25.1
    Pygments                  2.17.2
    pyOpenSSL                 22.0.0
    pyparsing                 3.1.1
    PySocks                   1.7.1
    python-dateutil           2.8.2
    python-etcd               0.4.5
    python-multipart          0.0.6
    pytz                      2022.1
    PyYAML                    6.0
    pyzmq                     25.1.2
    referencing               0.33.0
    regex                     2023.12.25
    requests                  2.28.1
    rich                      13.7.0
    rpds-py                   0.17.1
    ruamel.yaml               0.17.21
    ruamel.yaml.clib          0.2.6
    ruff                      0.1.14
    safetensors               0.4.2
    scipy                     1.12.0
    segment-anything          1.0         /home/appuser/Grounded-Segment-Anything/segment_anything
    semantic-version          2.10.0
    setuptools                65.5.0
    shellingham               1.5.4
    six                       1.16.0
    sniffio                   1.3.0
    sortedcontainers          2.4.0
    soupsieve                 2.3.2.post1
    spconv                    2.3.6
    stack-data                0.2.0
    starlette                 0.35.1
    supervision               0.18.0
    sympy                     1.11.1
    termcolor                 2.4.0
    timm                      0.9.12
    tokenizers                0.15.1
    toml                      0.10.2
    tomli                     2.0.1
    tomlkit                   0.12.0
    toolz                     0.12.0
    torch                     1.13.1
    torchelastic              0.2.2
    torchtext                 0.14.1
    torchvision               0.14.1
    tornado                   6.4
    tqdm                      4.64.1
    traitlets                 5.7.1
    transformers              4.37.1
    typer                     0.9.0
    types-dataclasses         0.6.6
    typing_extensions         4.9.0
    tzdata                    2023.4
    urllib3                   1.26.13
    uvicorn                   0.27.0
    wcwidth                   0.2.5
    websockets                11.0.3
    wheel                     0.37.1
    yapf                      0.40.2
    zipp                      3.17.0

## nvcc -V

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2022 NVIDIA Corporation
    Built on Tue_Mar__8_18:18:20_PST_2022
    Cuda compilation tools, release 11.6, V11.6.124
    Build cuda_11.6.r11.6/compiler.31057947_0

## nvidia-smi

    Mon Jan 29 12:39:50 2024
    +---------------------------------------------------------------------------------------+
    | NVIDIA-SMI 535.146.02             Driver Version: 535.146.02   CUDA Version: 12.2     |
    |-----------------------------------------+----------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
    |                                         |                      |               MIG M. |
    |=========================================+======================+======================|
    |   0  NVIDIA GeForce RTX 3090        Off | 00000000:01:00.0 Off |                  N/A |
    |  0%   25C    P8              28W / 350W |  19630MiB / 24576MiB |      0%      Default |
    |                                         |                      |                  N/A |
    +-----------------------------------------+----------------------+----------------------+

    +---------------------------------------------------------------------------------------+
    | Processes:                                                                            |
    |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
    |        ID   ID                                                             Usage      |
    |=======================================================================================|
    +---------------------------------------------------------------------------------------+
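The two tools report different things: `nvcc -V` prints the installed CUDA toolkit release, while the `CUDA Version` field of `nvidia-smi` is the highest CUDA version the driver supports. The following is a minimal sketch for extracting both numbers from the text output so they can be compared with the build of a CUDA-dependent package. The helper names `nvcc_release` and `driver_cuda_version` are hypothetical, and the sample strings are copied from the outputs in this report.

```python
import re


def nvcc_release(nvcc_output):
    """Return the toolkit release (e.g. '11.6') from `nvcc -V` text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def driver_cuda_version(smi_output):
    """Return the 'CUDA Version' field from `nvidia-smi` text, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    nvcc_text = "Cuda compilation tools, release 11.6, V11.6.124"
    smi_text = "NVIDIA-SMI 535.146.02 Driver Version: 535.146.02 CUDA Version: 12.2"
    print("toolkit:", nvcc_release(nvcc_text))            # toolkit: 11.6
    print("driver max:", driver_cuda_version(smi_text))   # driver max: 12.2
```

In this report the toolkit release is 11.6 and the driver supports up to 12.2.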

rentainhe commented 5 months ago

You may refer to this repo for better usage of this demo: https://github.com/dvlab-research/3D-Box-Segment-Anything