Closed hannan72 closed 1 month ago
Any updates @sanchit-gandhi @ArthurZucker ?
@eustlb, would you mind looking at this issue if you have some bandwidth?
I think the error is caused by checking JAX arrays with `not` in the `tokenization_whisper.py` code:
https://github.com/huggingface/transformers/blob/d1f39c484d8347aa7b3170ea250a1e8f3bdfdf31/src/transformers/models/whisper/tokenization_whisper.py#L852
It is OK to check `token_ids` this way if it is torch or np, but a JAX array cannot be used directly in a boolean context (e.g., `if not jax_array:`), so JAX raises an error in such cases:
File "/usr/local/lib/python3.10/dist-packages/jax/_src/array.py", line 258, in __bool__
core.check_bool_conversion(self)
File "/usr/local/lib/python3.10/dist-packages/jax/_src/core.py", line 654, in check_bool_conversion
raise ValueError("The truth value of an array with more than one element"
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
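To make the failure mode concrete, here is a minimal sketch of the pattern and one possible way around it. It is not the code from `tokenization_whisper.py` or from any fix: `AmbiguousArray` is a hypothetical stand-in that only mimics the `__bool__` behaviour shown in the traceback (so the sketch runs without JAX installed), and `to_plain_list` and `is_empty` are illustrative helper names. The sketch assumes the array-like object exposes `tolist()`.

```python
class AmbiguousArray:
    """Hypothetical stand-in for a multi-element JAX array (not a real JAX type)."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __bool__(self):
        # Mimics the error in the traceback above for arrays with more than one element.
        if len(self._values) > 1:
            raise ValueError(
                "The truth value of an array with more than one element is ambiguous."
            )
        return bool(self._values and self._values[0])

    def tolist(self):
        return list(self._values)


def to_plain_list(token_ids):
    """Convert array-like token ids to a Python list before any truthiness check."""
    if hasattr(token_ids, "tolist"):
        return token_ids.tolist()
    return list(token_ids)


def is_empty(token_ids):
    # len() is well defined for lists and 1-D arrays alike; `not token_ids` is not.
    return len(to_plain_list(token_ids)) == 0


token_ids = AmbiguousArray([50258, 50259, 50360])

try:
    if not token_ids:  # the problematic pattern
        pass
except ValueError as err:
    print("boolean check failed:", err)

print("empty?", is_empty(token_ids))  # safe check via an explicit length test
```

The idea is simply to convert to a plain list (or test the length explicitly) before relying on truthiness, so the check behaves the same for lists and array types.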
Feel free to open a PR for a fix then!
I created a PR: https://github.com/huggingface/transformers/pull/33151
@ArthurZucker Please review and merge it
Closing as it was merged!
System Info
transformers version: 4.43.0
Who can help?
@sanchit-gandhi
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
I use this piece of code to run a sample audio file through the Flax Whisper-large-v3 model with JAX.
It worked properly up to version 4.42.4 of transformers, but from version 4.43.0 onward it raises an error on the last line of the code (`batch_decode`):
However, if I disable the `skip_special_tokens` arg of the `batch_decode` method (set it to `False`), it raises no error but returns lots of special chars.
Expected behavior
The `batch_decode` method is expected to return a list of strings, the same as it did up to version 4.42.4 of transformers.