huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
132.01k stars 26.29k forks source link

fp16 support for grounding dino #32672

Open Benjamin-Tan opened 1 month ago

Benjamin-Tan commented 1 month ago

Feature request

Currently, if fp16 is used with grounding dino via https://huggingface.co/docs/transformers/main/en/model_doc/grounding-dino, there is an error of the following:

...
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 3023, in forward
    outputs = self.model(
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 2360, in forward
    encoder_outputs = self.encoder(
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 1753, in forward
    (vision_features, text_features), attentions = encoder_layer(
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 1274, in forward
    (text_features, text_enhanced_attn) = self.text_enhancer_layer(
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 828, in forward
    attention_output, attention_weights = self.self_attn(
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 1331, in forward
    query_layer = self.transpose_for_scores(self.query(queries))
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half

Motivation

It would be good to add support for fp16 to speed up the inference time.

Your contribution

Happy to contribute if this feature is deemed to be useful.

molbap commented 1 month ago

Hi @Benjamin-Tan, thanks for the issue! Indeed, I ran a quick check and confirm bf16 and f16 aren't working right now. It would definitely be nice to have, feel free to open a PR if you want to contribute to this!