artificialzjy opened 8 months ago
I'm getting a similar error. I believe it has to do with the CUDA version, but I'm not sure how to solve it. I'm running CUDA 12.2.
I upgraded the torch version to 2.2.1, which resolved the issue. I hope this solution is beneficial to you all.
I am getting the same error. I installed using Docker within an Ubuntu 20.04 WSL2 distribution.
nvcc --version
reports CUDA release 11.6.
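In case it helps narrow down a version mismatch, here is a minimal check (just a sketch, nothing specific to this repo) that prints the CUDA version the installed torch wheel was built against, which you can compare with the release that nvcc reports:

import torch

print("torch version:", torch.__version__)
print("torch built with CUDA:", torch.version.cuda)  # None for CPU-only wheels
print("CUDA available at runtime:", torch.cuda.is_available())

If torch.version.cuda and the nvcc release differ by a major version, the extension built by setup.py may fail to load.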
This error results from a problem loading the custom C++ operations required by the GroundingDINO model. The warning message "Failed to load custom C++ ops. Running on CPU mode Only!" indicates that the necessary compiled C++ operations were not found or could not be loaded.
GroundingDINO needs some additional requirements and modules. These can be installed using the requirements.txt and setup.py files inside the GroundingDINO directory. Navigate to it and run:
pip install -r requirements.txt
and
python setup.py install
All necessary prerequisites should now be installed. Navigate back to the parent directory and try running the demo again.
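To confirm that the build actually produced the compiled ops, a quick check like the following should print a path instead of the CPU-only warning (a sketch; it assumes the package is installed under the name groundingdino):

# If this import succeeds, the custom C++ ops the warning complains about are available.
try:
    from groundingdino import _C
    print("custom C++ ops loaded from:", _C.__file__)
except Exception as exc:
    print("custom C++ ops still missing:", exc)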
As a quick fix (or an alternative solution that goes in a different direction), which I came up with because I had to make it work in an environment where I could not install the CUDA toolkit, you can change the GroundingDINO ms_deform_attn.py code to use multi_scale_deformable_attn_pytorch. The Grounding DINO code should work fine after that, but I have to admit that I did not test it thoroughly, and I am not sure whether the PyTorch implementation has any runtime disadvantages on a GPU compared to their compiled CUDA implementation. To be more precise:
- In Grounded-Segment-Anything/GroundingDINO/groundingdino/models/GroundingDINO/ms_deform_attn.py, delete the code on lines 28 to 30, i.e.,
try:
    from groundingdino import _C
except:
    warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
- Delete the MultiScaleDeformableAttnFunction implementation on lines 41 to 90, i.e.,
class MultiScaleDeformableAttnFunction(Function):
    @staticmethod
    def forward(
        ctx,
        value,
        value_spatial_shapes,
        value_level_start_index,
        sampling_locations,
        attention_weights,
        im2col_step,
    ):
        ctx.im2col_step = im2col_step
        output = _C.ms_deform_attn_forward(
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
            ctx.im2col_step,
        )
        ctx.save_for_backward(
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
        )
        return output

    @staticmethod
    @once_differentiable
    def backward(ctx, grad_output):
        (
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
        ) = ctx.saved_tensors
        grad_value, grad_sampling_loc, grad_attn_weight = _C.ms_deform_attn_backward(
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
            grad_output,
            ctx.im2col_step,
        )
        return grad_value, None, None, grad_sampling_loc, grad_attn_weight, None
- Change the forward function of the MultiScaleDeformableAttention module to use the PyTorch implementation, i.e., change lines 329 to 352 from
if torch.cuda.is_available() and value.is_cuda:
    halffloat = False
    if value.dtype == torch.float16:
        halffloat = True
        value = value.float()
        sampling_locations = sampling_locations.float()
        attention_weights = attention_weights.float()

    output = MultiScaleDeformableAttnFunction.apply(
        value,
        spatial_shapes,
        level_start_index,
        sampling_locations,
        attention_weights,
        self.im2col_step,
    )

    if halffloat:
        output = output.half()
else:
    output = multi_scale_deformable_attn_pytorch(
        value, spatial_shapes, sampling_locations, attention_weights
    )
to
output = multi_scale_deformable_attn_pytorch(
    value, spatial_shapes, sampling_locations, attention_weights
)
Note that you probably need to run pip uninstall and reinstall the library after that.
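If you want to sanity-check the pure-PyTorch path after the change, a rough smoke test along these lines should run without the compiled _C extension. This is just a sketch with made-up tensor shapes, and it assumes multi_scale_deformable_attn_pytorch can be imported from groundingdino.models.GroundingDINO.ms_deform_attn:

import torch
from groundingdino.models.GroundingDINO.ms_deform_attn import (
    multi_scale_deformable_attn_pytorch,
)

# Dummy shapes: batch, attention heads, channels per head, queries, points per level.
bs, num_heads, head_dim = 2, 8, 32
num_queries, num_points = 10, 4
spatial_shapes = torch.tensor([[16, 16], [8, 8]])  # two feature levels as (H, W)
num_value = int((spatial_shapes[:, 0] * spatial_shapes[:, 1]).sum())

value = torch.rand(bs, num_value, num_heads, head_dim)
sampling_locations = torch.rand(bs, num_queries, num_heads, len(spatial_shapes), num_points, 2)
attention_weights = torch.rand(bs, num_queries, num_heads, len(spatial_shapes), num_points)
attention_weights = attention_weights / attention_weights.sum(dim=(-1, -2), keepdim=True)

output = multi_scale_deformable_attn_pytorch(
    value, spatial_shapes, sampling_locations, attention_weights
)
print(output.shape)  # expected: torch.Size([2, 10, 256]), i.e. (bs, num_queries, num_heads * head_dim)

If that prints the expected shape without _C ever being imported, the fallback is wired up correctly.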
It works for me. Thanks!
The following approach works for me:
cd ./Grounded-Segment-Anything/GroundingDINO/ &&\
python setup.py build &&\
python setup.py install
I followed the instructions in the README, but when I try the first demo I get the error: Failed to load custom C++ ops. Running on CPU mode Only! and NameError: name '_C' is not defined.
I installed torch, torchaudio, and torchvision with pip.