Closed: pavi-ninjaac closed this issue 4 months ago
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Closing as #31109 was merged in
System Info
transformers.__version__  # 4.5.1
!python -V  # Python 3.8.10
Operating System: Ubuntu 20.04.6 LTS
I have faced the following issue:
I think I have figured it out and opened a PR for it: #31109. Please help me with this and review the PR.
Who can help?
No response
Reproduction
from transformers import AutoTokenizer, DataCollatorForSeq2Seq, TFAutoModelForSeq2SeqLM

model_checkpoint = "google/mt5-small"
model_checkpoint = "google/pegasus-cnn_dailymail"  # overrides the assignment above

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = TFAutoModelForSeq2SeqLM.from_pretrained(model_checkpoint, from_pt=True)

# tokenized_data is a tokenized dataset with a "train" split, prepared earlier (not shown in this report)
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model, return_tensors="tf")
features = [tokenized_data["train"][i] for i in range(2)]
data_collator(features)

tf_train_dataset = model.prepare_tf_dataset(
    tokenized_data["train"],
    collate_fn=data_collator,
    shuffle=True,
    batch_size=8,
)
Expected behavior
I would expect it to return a dataset that can be used with TensorFlow.
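To make the expected result more concrete, here is a rough sketch of what the collate step is meant to produce for one batch: every field padded to the longest sequence in that batch. This is plain Python for illustration only, not the library's implementation. The helper name `pad_batch` and the toy token ids are made up, and `pad_token_id=0` is an assumption. The `-100` label padding matches the default `label_pad_token_id` of `DataCollatorForSeq2Seq`.

```python
from typing import Dict, List

# Default label padding value used by DataCollatorForSeq2Seq (ignored by the loss).
LABEL_PAD_ID = -100


def pad_batch(
    features: List[Dict[str, List[int]]], pad_token_id: int = 0
) -> Dict[str, List[List[int]]]:
    """Hypothetical helper: pad each field to the longest sequence in the batch."""
    pad_values = {
        "input_ids": pad_token_id,  # assumed pad token id
        "attention_mask": 0,        # padded positions are masked out
        "labels": LABEL_PAD_ID,
    }
    batch: Dict[str, List[List[int]]] = {}
    for key, pad_value in pad_values.items():
        if key not in features[0]:
            continue
        longest = max(len(f[key]) for f in features)
        batch[key] = [f[key] + [pad_value] * (longest - len(f[key])) for f in features]
    return batch


# Toy features standing in for two tokenized examples of different lengths.
features = [
    {"input_ids": [5, 6, 7], "attention_mask": [1, 1, 1], "labels": [8, 9]},
    {"input_ids": [5, 6], "attention_mask": [1, 1], "labels": [8, 9, 10, 11]},
]
batch = pad_batch(features)
```

After padding, every list under a given key has the same length, which is what allows a batch to be stacked into rectangular tensors.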