huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Type mis-match in function make_log_bucket_position() of TF DeBERTa V2 #31988

Open pinesnow72 opened 2 months ago

pinesnow72 commented 2 months ago

System Info

Who can help?

@ArthurZucker, @Rocketknight1

Information

Tasks

Reproduction

While trying to make TFDebertaV2Model work with mixed precision, I checked the code in modeling_tf_deberta_v2.py and found an expression in make_log_bucket_position() that can raise a dtype mismatch error at execution time. The following snippet reproduces it:

```
from transformers.models.deberta_v2.modeling_tf_deberta_v2 import make_log_bucket_position
import tensorflow as tf

relative_pos = tf.constant([1,2,3,4], tf.int32)
bucket_size = tf.constant(5, tf.int32)
max_position = tf.constant(4, tf.int32)
make_log_bucket_position(relative_pos, bucket_size, max_position)
```

This code throws the following error message:

Traceback (most recent call last):
  File "", line 1, in
  File "/home/swlee/miniconda3/envs/tf216/lib/python3.12/site-packages/transformers/models/deberta_v2/modeling_tf_deberta_v2.py", line 551, in make_log_bucket_position
    tf.cast(tf.math.log(abs_pos / mid), tf.float32) / tf.math.log((max_position - 1) / mid) * (mid - 1)
  File "/home/swlee/miniconda3/envs/tf216/lib/python3.12/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/swlee/miniconda3/envs/tf216/lib/python3.12/site-packages/tensorflow/python/ops/math_ops.py", line 1412, in _truediv_python3
    raise TypeError(f"`x` and `y` must have the same dtype, "
TypeError: `x` and `y` must have the same dtype, got tf.float32 != tf.float64.
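
The mismatch appears to come from TensorFlow's true division: dividing two int32 tensors with `/` yields a float64 tensor, so the denominator `tf.math.log((max_position - 1) / mid)` stays float64 while the numerator has been explicitly cast to float32. A minimal sketch that isolates this, assuming eager mode and default dtype settings (the variable names `ratio`, `numerator`, and `denominator` are mine, not from the library):

```python
import tensorflow as tf

# Same inputs as the reproduction above.
abs_pos = tf.constant([1, 2, 3, 4], tf.int32)
mid = tf.constant(5, tf.int32) // 2          # floor division keeps int32
max_position = tf.constant(4, tf.int32)

ratio = abs_pos / mid                        # int32 / int32 true division gives float64
numerator = tf.cast(tf.math.log(ratio), tf.float32)    # explicitly cast to float32
denominator = tf.math.log((max_position - 1) / mid)    # never cast, stays float64

print(ratio.dtype, numerator.dtype, denominator.dtype)

# Dividing float32 by float64 is rejected rather than silently promoted.
try:
    numerator / denominator
    mismatch_raised = False
except (TypeError, tf.errors.InvalidArgumentError):
    mismatch_raised = True

print("dtype mismatch raised:", mismatch_raised)
```

Casting the denominator (and the `mid - 1` factor) to float32 before combining them would keep every operand in one dtype.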

### Expected behavior

No TypeError is raised, and the function returns bucket_pos as an int32 tensor.

In modeling_tf_deberta_v2.py, lines 550-553 of make_log_bucket_position() currently read:
```
  tf.math.ceil(
      tf.cast(tf.math.log(abs_pos / mid), tf.float32) / tf.math.log((max_position - 1) / mid) * (mid - 1)
  )
  + mid
```
A possible correction would be:
```
  tf.math.ceil(
      tf.cast(tf.math.log(abs_pos / mid), tf.float32) / tf.cast(tf.math.log((max_position - 1) / mid), tf.float32) * tf.cast(mid - 1, tf.float32)  # in graph mode
  )
  + tf.cast(mid, tf.float32)
```
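
To sanity-check a fix like this, it can help to have dtype-free reference values to compare against. The following is a plain-Python sketch of the same arithmetic, not the library's implementation. The name `log_bucket_position_reference` is hypothetical, and the clamping and sign handling around the logarithm are my reading of the surrounding function, so they should be checked against the actual source. It also computes in float64, so results could differ from a float32 computation when the value passed to `ceil` sits very close to an integer.

```python
import math


def log_bucket_position_reference(relative_pos, bucket_size, max_position):
    """Plain-Python mirror of the log-bucket arithmetic, one position at a time."""
    mid = bucket_size // 2
    results = []
    for pos in relative_pos:
        sign = (pos > 0) - (pos < 0)
        # Assumption: positions strictly inside (-mid, mid) are clamped to mid - 1.
        abs_pos = mid - 1 if -mid < pos < mid else abs(pos)
        if abs_pos <= mid:
            # Near positions keep their original value.
            results.append(pos)
            continue
        # Same expression as in the issue, with every operand a Python float or int.
        log_pos = math.ceil(
            math.log(abs_pos / mid) / math.log((max_position - 1) / mid) * (mid - 1)
        ) + mid
        results.append(int(log_pos * sign))
    return results


print(log_bucket_position_reference([1, 2, 3, 4], bucket_size=5, max_position=4))
```

For the reproduction inputs this prints `[1, 2, 3, 4]`, which is what a corrected TF function would be expected to return as an int32 tensor.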
ArthurZucker commented 1 month ago

Hey! Would you like to open a PR for a fix, adding this as a test as well? 🤗

github-actions[bot] commented 1 day ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.