Closed by wolfassi123 8 months ago
Hey, can you share the exact traceback to debug this?
Sure thing!
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
[<ipython-input-17-cb38b76c7066>](https://localhost:8080/#) in <cell line: 24>()
22 )
23
---> 24 trainer.train()
12 frames
[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1553 hf_hub_utils.enable_progress_bars()
1554 else:
-> 1555 return inner_training_loop(
1556 args=args,
1557 resume_from_checkpoint=resume_from_checkpoint,
[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
1858
1859 with self.accelerator.accumulate(model):
-> 1860 tr_loss_step = self.training_step(model, inputs)
1861
1862 if (
[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in training_step(self, model, inputs)
2723
2724 with self.compute_loss_context_manager():
-> 2725 loss = self.compute_loss(model, inputs)
2726
2727 if self.args.n_gpu > 1:
[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in compute_loss(self, model, inputs, return_outputs)
2746 else:
2747 labels = None
-> 2748 outputs = model(**inputs)
2749 # Save past state if it exists
2750 # TODO: this needs to be fixed and made cleaner later.
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
1519
1520 def _call_impl(self, *args, **kwargs):
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *args, **kwargs)
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1528
1529 try:
[/usr/local/lib/python3.10/dist-packages/accelerate/utils/operations.py](https://localhost:8080/#) in forward(*args, **kwargs)
678
679 def forward(*args, **kwargs):
--> 680 return model_forward(*args, **kwargs)
681
682 # To act like a decorator so that it can be popped when doing `extract_model_from_parallel`
[/usr/local/lib/python3.10/dist-packages/accelerate/utils/operations.py](https://localhost:8080/#) in __call__(self, *args, **kwargs)
666
667 def __call__(self, *args, **kwargs):
--> 668 return convert_to_fp32(self.model_forward(*args, **kwargs))
669
670 def __getstate__(self):
[/usr/local/lib/python3.10/dist-packages/torch/amp/autocast_mode.py](https://localhost:8080/#) in decorate_autocast(*args, **kwargs)
14 def decorate_autocast(*args, **kwargs):
15 with autocast_instance:
---> 16 return func(*args, **kwargs)
17
18 decorate_autocast.__script_unsupported = "@autocast() decorator is not supported in script mode" # type: ignore[attr-defined]
[/usr/local/lib/python3.10/dist-packages/transformers/models/t5/modeling_t5.py](https://localhost:8080/#) in forward(self, input_ids, attention_mask, decoder_input_ids, decoder_attention_mask, head_mask, decoder_head_mask, cross_attn_head_mask, encoder_outputs, past_key_values, inputs_embeds, decoder_inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict)
1707 if encoder_outputs is None:
1708 # Convert encoder inputs in embeddings if needed
-> 1709 encoder_outputs = self.encoder(
1710 input_ids=input_ids,
1711 attention_mask=attention_mask,
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
1519
1520 def _call_impl(self, *args, **kwargs):
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *args, **kwargs)
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1528
1529 try:
[/usr/local/lib/python3.10/dist-packages/transformers/models/t5/modeling_t5.py](https://localhost:8080/#) in forward(self, input_ids, attention_mask, encoder_hidden_states, encoder_attention_mask, inputs_embeds, head_mask, cross_attn_head_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
1016 inputs_embeds = self.embed_tokens(input_ids)
1017
-> 1018 batch_size, seq_length = input_shape
1019
1020 # required mask seq length can be calculated via length of past
ValueError: too many values to unpack (expected 2)
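I placed a breakpoint in your code, and there is an issue with the inputs:
inputs["input_ids"].shape
torch.Size([16, 1, 512])
There is an extra dimension, which probably comes from the way the dataset is processed or from the data collator. The T5 encoder unpacks the input shape as (batch_size, seq_length), so a three-dimensional input raises the ValueError shown in the traceback.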
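To make the shape problem concrete, here is a minimal sketch in plain Python (no torch needed). It assumes, hypothetically, that each example's `input_ids` ended up wrapped in an extra list, which is one common way a `[16, 1, 512]` batch can arise; the original preprocessing code is not shown in the thread, so this is only an illustration. The helper names `nested_shape` and `unpack` are made up for this example.

```python
def nested_shape(value):
    """Return the shape of a regular nested list, e.g. [[1, 2, 3]] -> (1, 3)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def unpack(shape):
    """Mimic the two-value unpacking seen in the traceback."""
    batch_size, seq_length = shape
    return batch_size, seq_length


SEQ_LEN = 512
BATCH = 16

# Hypothetical per-example features: one wrapped in an extra list, one flat.
wrapped_example = [[0] * SEQ_LEN]  # shape (1, 512)
flat_example = [0] * SEQ_LEN       # shape (512,)

bad_shape = nested_shape([wrapped_example for _ in range(BATCH)])
good_shape = nested_shape([flat_example for _ in range(BATCH)])

print(bad_shape)   # three dimensions: unpacking into two names fails
print(good_shape)  # two dimensions: unpacking works

try:
    unpack(bad_shape)
except ValueError as err:
    print("ValueError:", err)
```

Checking `inputs["input_ids"].shape` on one batch before calling `trainer.train()` is a quick way to catch this kind of mismatch early.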
The following code fixed it:
def preprocess_function(examples):
    # Join question and context into a single input string
    combined_input = examples["Question"] + ": " + examples["true_contexts"]
    model_inputs = tokenizer(combined_input, max_length=512, padding="max_length", truncation=True)
    # Tokenize the target text and attach it as labels
    labels = tokenizer(text_target=examples["Rewrite"], max_length=512, padding="max_length", truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
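The fix above concatenates strings directly, so it fits a per-example `map` call. As a hedged sketch, a variant for batched mapping (where each column arrives as a list) might look like the following. The factory name `build_preprocess_fn` and its `max_length` parameter are assumptions added for illustration; the column names come from the fix above. The tokenizer returns plain Python lists here (no `return_tensors`), so no extra leading dimension is introduced per example.

```python
def build_preprocess_fn(tokenizer, max_length=512):
    """Return a preprocessing function that works on batches of examples.

    `tokenizer` is expected to behave like a Hugging Face tokenizer:
    callable on a list of strings and accepting `text_target=`.
    """

    def preprocess_function(examples):
        # Each column is a list when mapping in batched mode, so pair them up.
        combined_inputs = [
            f"{question}: {context}"
            for question, context in zip(examples["Question"], examples["true_contexts"])
        ]
        model_inputs = tokenizer(
            combined_inputs,
            max_length=max_length,
            padding="max_length",
            truncation=True,
        )
        labels = tokenizer(
            text_target=examples["Rewrite"],
            max_length=max_length,
            padding="max_length",
            truncation=True,
        )
        model_inputs["labels"] = labels["input_ids"]
        return model_inputs

    return preprocess_function
```

It would be used roughly as `dataset.map(build_preprocess_fn(tokenizer), batched=True)`, assuming `dataset` and `tokenizer` are already defined as in the original code.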
Hi Wolfassi, actually I'm writing to talk to you about your work in Arabic OCR. I've been trying to do some Arabic OCR, but I can only get an accuracy rate of about 95%. Have you been able to do any better than that, and if so, how?
Hello there. Yes, I have previously worked on Arabic OCR, and no, to be honest, I did not achieve that high an accuracy. I believe an accuracy of 95% is just too high to target. I tested using both EasyOCR and Tesseract. Tesseract seemed to perform best once you fine-tune the model for the specific font you are using. I would also suggest trying out PaddlePaddle.
Hi @wolfassi123, I am facing the error below while fine-tuning. Could you please help?
code:
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./roberta-retrained",
    overwrite_output_dir=True,
    num_train_epochs=25,
    per_device_train_batch_size=48,
    save_total_limit=2,
)

# Initialize the Trainer
trainer = Trainer(
    model=cyberspecificmodel,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)

# Train the model
trainer.train()

# Save the fine-tuned model
trainer.save_model("./roberta_cybersecurity")
**ERROR**
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
[<ipython-input-8-a5adcccd61c2>](https://localhost:8080/#) in <cell line: 12>()
10
11 # Initialize the Trainer
---> 12 trainer = Trainer(
13 model=cyberspecificmodel,
14 args=training_args,
1 frames
[/usr/local/lib/python3.10/dist-packages/transformers/trainer.py](https://localhost:8080/#) in create_accelerator_and_postprocess(self)
4125 accelerator_kwargs = AcceleratorConfig(**accelerator_kwargs).to_dict()
4126
-> 4127 self.accelerator = Accelerator(
4128 deepspeed_plugin=self.args.deepspeed_plugin,
4129 gradient_accumulation_plugin=gradient_accumulation_plugin,
TypeError: Accelerator.__init__() got an unexpected keyword argument 'use_seedable_sampler'
@Sauce16 this seems like an unrelated issue. Please open a new issue with a reproducer if you want help, and make sure it still fails with the latest versions!
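An `unexpected keyword argument` error raised while constructing `Accelerator` often points to mismatched library versions, presumably a transformers release that is newer than the installed accelerate. That is an assumption, not something confirmed in this thread. A minimal sketch for collecting the installed versions to include in a new issue, using only the standard library (the helper name `installed_versions` is made up for this example):

```python
from importlib import metadata


def installed_versions(packages=("transformers", "accelerate", "torch")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If the versions turn out to be far apart, upgrading both packages together and restarting the notebook runtime is a reasonable first thing to try before filing the issue.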
System Info
transformers version: 4.35.2
Who can help?
@ArthurZucker @youne
Reproduction
I am trying to fine-tune a T5-base model but have been running into issues despite following the step-by-step guide found on the Hugging Face Hub here.
So far this is my code:
transformers.logging.set_verbosity_error()
I tried several examples, including my own customized class for the Trainer, but always ended up with the same issue, even when I ran the same code found in the step-by-step guide provided by Hugging Face.
The error happens when calling
trainer.train()
which returns the following:
ValueError: too many values to unpack (expected 2)
I followed the exact same format as the documentation. I believe something goes wrong when the loss function is called, but I was unable to put my finger on it. If anyone can help, that would be great.
Expected behavior
The T5 model should fine-tune on the above dataset without this error, or the cause of the error should be identified.