Type mis-match in function make_log_bucket_position() of TF DeBERTa V2

pinesnow72 commented 2 months ago

System Info

transformers version: 4.41.2
Platform: Linux-5.15.0-107-generic-x86_64-with-glibc2.35
Python version: 3.12.3
Huggingface_hub version: 0.23.2
Safetensors version: 0.4.3
Accelerate version: not installed
Accelerate config: not found
PyTorch version (GPU): 2.3.0+cu121 (True)
Tensorflow version (GPU): 2.16.1 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: Yes

Who can help?

@ArthurZucker, @Rocketknight1

Information

[ ] The official example scripts
[ ] My own modified scripts

Tasks

[ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

While trying to make TFDebertaV2Model work with mixed precision, I was reading modeling_tf_deberta_v2.py and found an expression in make_log_bucket_position() (quoted under Expected behavior) that raises a dtype mismatch error at execution time. The following snippet reproduces it:

from transformers.models.deberta_v2.modeling_tf_deberta_v2 import make_log_bucket_position
import tensorflow as tf

# All three arguments are int32 tensors.
relative_pos = tf.constant([1, 2, 3, 4], tf.int32)
bucket_size = tf.constant(5, tf.int32)
max_position = tf.constant(4, tf.int32)
make_log_bucket_position(relative_pos, bucket_size, max_position)

This code throws the following error message:

Traceback (most recent call last):
  File "", line 1, in
  File "/home/swlee/miniconda3/envs/tf216/lib/python3.12/site-packages/transformers/models/deberta_v2/modeling_tf_deberta_v2.py", line 551, in make_log_bucket_position
    tf.cast(tf.math.log(abs_pos / mid), tf.float32) / tf.math.log((max_position - 1) / mid) * (mid - 1)
  File "/home/swlee/miniconda3/envs/tf216/lib/python3.12/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/swlee/miniconda3/envs/tf216/lib/python3.12/site-packages/tensorflow/python/ops/math_ops.py", line 1412, in _truediv_python3
    raise TypeError(f"`x` and `y` must have the same dtype, "
TypeError: `x` and `y` must have the same dtype, got tf.float32 != tf.float64.
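
The mismatch appears to come from TensorFlow's true-division rule: dividing two int32 tensors yields float64, so both logarithms start out as float64, and only the numerator is cast down to float32. Below is a minimal sketch of that behavior. The values are illustrative, and `mid` is assumed to be `bucket_size // 2` (an assumption about the library code, which is not quoted in full here):

```python
import tensorflow as tf

# Illustrative int32 values; mid is assumed to equal bucket_size // 2 (5 // 2 = 2).
abs_pos = tf.constant([2, 3, 4], tf.int32)
mid = tf.constant(2, tf.int32)
max_position = tf.constant(4, tf.int32)

# True division of two int32 tensors is promoted to float64.
ratio = abs_pos / mid
numerator = tf.cast(tf.math.log(ratio), tf.float32)   # explicitly cast to float32
denominator = tf.math.log((max_position - 1) / mid)   # never cast, stays float64

print(ratio.dtype, numerator.dtype, denominator.dtype)

# Dividing float32 by float64 is rejected rather than promoted implicitly.
try:
    numerator / denominator
    mismatch_raised = False
except (TypeError, tf.errors.InvalidArgumentError) as err:
    mismatch_raised = True
    print(type(err).__name__, err)
```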

Expected behavior

No TypeError is raised, and the function returns bucket_pos with dtype int32.

In modeling_tf_deberta_v2.py (lines 550-553, inside make_log_bucket_position()):
```
  tf.math.ceil(
      tf.cast(tf.math.log(abs_pos / mid), tf.float32) / tf.math.log((max_position - 1) / mid) * (mid - 1)
  )
  + mid
```
A possible correction:
```
  tf.math.ceil(
      tf.cast(tf.math.log(abs_pos / mid), tf.float32) / tf.cast(tf.math.log((max_position - 1) / mid), tf.float32) * tf.cast(mid - 1, tf.float32)  # in graph mode
  )
  + tf.cast(mid, tf.float32)
```
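
The corrected expression can also be checked in isolation. The helper below is a hypothetical standalone function (`log_bucket_term` is not a name from transformers) that takes `abs_pos`, `mid`, and `max_position` as int32 tensors and casts every operand to float32 before combining them. It is a sketch of the proposed fix under those assumptions, not the library's implementation:

```python
import tensorflow as tf


def log_bucket_term(abs_pos, mid, max_position):
    """Hypothetical helper: the log-bucket expression with explicit float32 casts."""
    mid_f = tf.cast(mid, tf.float32)
    # Cast both logarithms to float32 so the two sides of the division agree.
    log_ratio = tf.cast(tf.math.log(abs_pos / mid), tf.float32)
    log_range = tf.cast(tf.math.log((max_position - 1) / mid), tf.float32)
    return tf.math.ceil(log_ratio / log_range * (mid_f - 1.0)) + mid_f


result = log_bucket_term(
    tf.constant([2, 3, 4], tf.int32),  # illustrative abs_pos values
    tf.constant(2, tf.int32),          # mid, assumed to be bucket_size // 2
    tf.constant(4, tf.int32),          # max_position from the reproduction
)
print(result.dtype, result.numpy())
```

The result is float32, matching the cast applied to the numerator in the existing code; the conversion to int32 would still happen in whatever code consumes this value.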

ArthurZucker commented 1 month ago

Hey! Would you like to open a PR for a fix, adding this as a test as well? 🤗

github-actions[bot] commented 1 day ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
