IDEA-Research / Grounded-SAM-2

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
https://arxiv.org/abs/2401.14159
Apache License 2.0

Docker GPU Issues #56

Open bgiffo96 opened 1 week ago

bgiffo96 commented 1 week ago

After setting up Grounded-SAM-2 using the provided Docker container and running the demo script:

/home/appuser/Grounded-SAM-2# python grounded_sam2_local_demo.py

I am met with the following warning and error:

UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1716905979055/work/aten/src/ATen/native/TensorShape.cpp:3587.)
final text_encoder_type: bert-base-uncased
model.safetensors: 100%|██████████| 440M/440M [00:05<00:00, 56.5MB/s]
UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1716905979055/work/torch/csrc/utils/tensor_numpy.cpp:206.)
UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
UserWarning: None of the inputs have requires_grad=True. Gradients will be None
Traceback (most recent call last):
  File "/home/appuser/Grounded-SAM-2/grounded_sam2_local_demo.py", line 58, in <module>
    boxes, confidences, labels = predict(
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/util/inference.py", line 68, in predict
    outputs = model(image[None], captions=[caption])
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/groundingdino.py", line 327, in forward
    hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/transformer.py", line 258, in forward
    memory, memory_text = self.encoder(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/transformer.py", line 576, in forward
    output = checkpoint.checkpoint(
  File "/opt/conda/lib/python3.10/site-packages/torch/_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 36, in inner
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 487, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 262, in forward
    outputs = run_function(*args)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/transformer.py", line 785, in forward
    src2 = self.self_attn(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/ms_deform_attn.py", line 338, in forward
    output = MultiScaleDeformableAttnFunction.apply(
  File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/ms_deform_attn.py", line 53, in forward
    output = _C.ms_deform_attn_forward(
NameError: name '_C' is not defined

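The final NameError appears to follow from the earlier "Failed to load custom C++ ops" warning: the compiled extension was never imported, so the name `_C` does not exist when the forward pass reaches it. A minimal sketch for surfacing the underlying import error instead of the generic warning is below. The module path `groundingdino._C` is an assumption inferred from the traceback paths, and `try_import` is a hypothetical helper, not part of the repository.

```python
import importlib
import traceback


def try_import(module_name):
    """Import module_name and return (ok, detail).

    On success, detail is the file the module was loaded from.
    On failure, detail is the full traceback text, which usually names the
    real cause (module not built, undefined symbol, missing shared library).
    """
    try:
        module = importlib.import_module(module_name)
    except Exception:  # ImportError, or OSError raised while loading a .so
        return False, traceback.format_exc()
    return True, getattr(module, "__file__", None) or "<built-in>"


if __name__ == "__main__":
    # Assumed module path, inferred from the traceback above.
    ok, detail = try_import("groundingdino._C")
    print("import ok:", ok)
    print(detail)
```

Run inside the container, this should print either the path of the compiled extension or the exception that the warning is hiding.
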
The container appears to build with the GPU correctly. Below are the results of the tests I ran to look for discrepancies:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

import torch
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
torch: 2.3 ; cuda: 2.3.1
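
Note that the `cuda:` value printed above is the torch version again rather than a CUDA version. Splitting on "+" only yields a CUDA tag when the version string carries a local suffix such as `+cu121`, which this build's string evidently does not. A small sketch of the difference follows. The version strings are illustrative, `local_suffix` is a hypothetical helper, and `torch.version.cuda` is the attribute that reports the CUDA version torch was built against.

```python
def local_suffix(version_string):
    """Return the part after '+' in a version string, or None if absent."""
    base, sep, suffix = version_string.partition("+")
    return suffix if sep else None


# Illustrative version strings, not taken from the container:
print(local_suffix("2.3.1+cu121"))  # cu121
print(local_suffix("2.3.1"))        # None
print("2.3.1".split("+")[-1])       # 2.3.1 (the whole string comes back)

try:
    import torch
except ImportError:
    torch = None

if torch is not None:
    print("torch:", torch.__version__)
    # CUDA version torch was built against (None for CPU-only builds)
    print("built against CUDA:", torch.version.cuda)
```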

print(torch.cuda.is_available()) True

print(torch.cuda.device_count()) 1

print(torch.cuda.current_device()) 0

print(torch.cuda.device(0)) <torch.cuda.device object at 0x7a10b3552080>

print(torch.cuda.get_device_name(0)) NVIDIA GeForce RTX 4090

The host machine runs Ubuntu 24.04 with CUDA 12.6, in case that is relevant.

I cannot for the life of me figure out what is causing the issue or how to solve it. Any support would be greatly appreciated.

bgiffo96 commented 2 days ago

I have tested the Dockerfile on another device and reproduced the error, following the same setup procedure described in the README.

/home/appuser/Grounded-SAM-2# python grounded_sam2_local_demo.py
UserWarning: Flash Attention is disabled as it requires a GPU with Ampere (8.0) CUDA capability.
UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1716905979055/work/aten/src/ATen/native/TensorShape.cpp:3587.)
final text_encoder_type: bert-base-uncased
UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1716905979055/work/torch/csrc/utils/tensor_numpy.cpp:206.)
UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
UserWarning: None of the inputs have requires_grad=True. Gradients will be None
Traceback (most recent call last):
  File "/home/appuser/Grounded-SAM-2/grounded_sam2_local_demo.py", line 58, in <module>
    boxes, confidences, labels = predict(
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/util/inference.py", line 68, in predict
    outputs = model(image[None], captions=[caption])
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/groundingdino.py", line 327, in forward
    hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/transformer.py", line 258, in forward
    memory, memory_text = self.encoder(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/transformer.py", line 576, in forward
    output = checkpoint.checkpoint(
  File "/opt/conda/lib/python3.10/site-packages/torch/_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 36, in inner
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 487, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 262, in forward
    outputs = run_function(*args)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/transformer.py", line 785, in forward
    src2 = self.self_attn(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/ms_deform_attn.py", line 338, in forward
    output = MultiScaleDeformableAttnFunction.apply(
  File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 598, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/appuser/Grounded-SAM-2/grounding_dino/groundingdino/models/GroundingDINO/ms_deform_attn.py", line 53, in forward
    output = _C.ms_deform_attn_forward(
NameError: name '_C' is not defined

import torch
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
torch: 2.3 ; cuda: 2.3.1

print(torch.cuda.is_available()) True

print(torch.cuda.device_count()) 1

print(torch.cuda.current_device()) 0

print(torch.cuda.device(0)) <torch.cuda.device object at 0x735d5cb464d0>

print(torch.cuda.get_device_name(0)) NVIDIA GeForce RTX 2070 SUPER

This host machine also runs Ubuntu 24.04, with CUDA 12.2, in case that is relevant.

This issue seems to be known: the Makefile references a similar issue (https://github.com/IDEA-Research/Grounded-Segment-Anything/issues/84). However, that reference suggests the problem has already been addressed.

Additionally, I have tried setting up Grounded-Segment-Anything and get the same error, again on both devices. I was initially trying to set up Grounded-Segment-Anything and only then tried Grounded-SAM-2, as I thought the issue could have come from the CUDA version being used.