The code for `batch_encode_plus` in transformers claims to work for both tuples and lists:
```python
if not isinstance(batch_text_or_text_pairs, (tuple, list)):
    raise TypeError(
        f"batch_text_or_text_pairs has to be a list or a tuple (got {type(batch_text_or_text_pairs)})"
    )
```
Certain fast tokenizers now fail on batches given as tuples, e.g. (on a MacBook M2 with transformers 4.46.1):
This still works with tokenizers v0.20.1. Presumably the regression is related to this PR: https://github.com/huggingface/tokenizers/pull/1665
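The type check quoted above can be exercised in isolation to show that a tuple batch is accepted by the documented contract, so the failure must come from a lower layer (the fast/Rust tokenizer). This is a minimal sketch; `validate_batch` is a stand-in replicating the check, and the commented-out tokenizer call at the end is hypothetical:

```python
# Stand-in replicating the type check from transformers' batch_encode_plus.
# Tuples pass this check, so the later failure happens past the Python layer.
def validate_batch(batch_text_or_text_pairs):
    if not isinstance(batch_text_or_text_pairs, (tuple, list)):
        raise TypeError(
            f"batch_text_or_text_pairs has to be a list or a tuple "
            f"(got {type(batch_text_or_text_pairs)})"
        )
    return True

# A tuple batch satisfies the documented contract...
assert validate_batch(("first sentence", "second sentence"))

# ...so until the regression is fixed, a practical workaround is to convert
# the batch to a list before encoding (hypothetical tokenizer instance):
# tokenizer.batch_encode_plus(list(batch), padding=True)
```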