ylacombe / finetune-hf-vits

Finetune VITS and MMS using HuggingFace's tools
MIT License

BatchEncoding.to() non_blocking error #28

Closed: isolveit-aps closed this issue 7 months ago

isolveit-aps commented 7 months ago

Forgive me if this is the wrong place to ask about this issue, but what could be the reason for the error below? I prepared a dataset for Faroese and am trying to start the finetuning by running the command:

accelerate launch run_vits_finetuning.py ./training_config_examples/finetune_mms_fao.json

It seems to start and I get a few confirmation messages:

[INFO|modeling_utils.py:3280] 2024-04-03 13:32:51,747 >> loading weights file /tmp/tmp2gq66xpi/model.safetensors
[INFO|modeling_utils.py:4024] 2024-04-03 13:32:51,786 >> All model checkpoint weights were used when initializing VitsDiscriminator.

[INFO|modeling_utils.py:4032] 2024-04-03 13:32:51,787 >> All the weights of VitsDiscriminator were initialized from the model checkpoint at /tmp/tmp2gq66xpi.
If your task is similar to the task the model of the checkpoint was trained on, you can already use VitsDiscriminator for predictions without further training.

But after a few seconds, I get this error message:

04/03/2024 13:32:53 - INFO - __main__ - ***** Running training *****
04/03/2024 13:32:53 - INFO - __main__ -   Num examples = 1075
04/03/2024 13:32:53 - INFO - __main__ -   Num Epochs = 200
04/03/2024 13:32:53 - INFO - __main__ -   Instantaneous batch size per device = 16
04/03/2024 13:32:53 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 16
04/03/2024 13:32:53 - INFO - __main__ -   Gradient Accumulation steps = 1
04/03/2024 13:32:53 - INFO - __main__ -   Total optimization steps = 13600
Steps:   0%|          | 0/13600 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/myuser/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 155, in send_to_device
    return tensor.to(device, non_blocking=non_blocking)
TypeError: BatchEncoding.to() got an unexpected keyword argument 'non_blocking'

During handling of the above exception, another exception occurred:


Traceback (most recent call last):
  File "/home/myuser/finetune-hf-vits/run_vits_finetuning.py", line 1494, in <module>
    main()
  File "/home/myuser/finetune-hf-vits/run_vits_finetuning.py", line 1090, in main
    for step, batch in enumerate(train_dataloader):
  File "/home/myuser/.local/lib/python3.10/site-packages/accelerate/data_loader.py", line 461, in __iter__
    current_batch = send_to_device(current_batch, self.device)
  File "/home/myuser/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 157, in send_to_device
    return tensor.to(device)
  File "/home/myuser/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 800, in to
    self.data = {k: v.to(device=device) for k, v in self.data.items()}
  File "/home/myuser/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 800, in <dictcomp>
    self.data = {k: v.to(device=device) for k, v in self.data.items()}
AttributeError: 'NoneType' object has no attribute 'to'
Traceback (most recent call last):
  File "/home/myuser/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 155, in send_to_device
    return tensor.to(device, non_blocking=non_blocking)
TypeError: BatchEncoding.to() got an unexpected keyword argument 'non_blocking'
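
For what it's worth, the double TypeError seems to come from accelerate's fallback logic. A rough reconstruction of accelerate/utils/operations.py lines 155-157, based only on the traceback above (not the exact source):

# Reconstructed from the traceback; not the exact accelerate source.
def send_to_device(tensor, device, non_blocking=False):
    try:
        # Newer accelerate passes non_blocking, which older
        # BatchEncoding.to() signatures do not accept...
        return tensor.to(device, non_blocking=non_blocking)
    except TypeError:
        # ...so it retries with a plain .to(device); it is this fallback
        # that then fails on a None value inside the batch.
        return tensor.to(device)

So the non_blocking TypeError is only the first symptom; the retry without the keyword then raises the real error, the AttributeError from a None value inside the BatchEncoding.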

Any clue as to what the reason for this error might be?

isolveit-aps commented 7 months ago

Sorry, this was a duplicate of issue #22; the solution from that issue works for me as well.
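
For anyone landing here later: the AttributeError at the bottom of the chain means one of the entries in the collated BatchEncoding was None when the dataloader moved the batch to the device. A minimal sketch of that failure mode (the key names are hypothetical, and it assumes a transformers version that, like tokenization_utils_base.py:800 in the traceback, calls .to() on every value unconditionally):

import torch
from transformers import BatchEncoding

# A batch with one unset entry, as a collator bug or version
# mismatch can produce. "labels" here is a hypothetical key;
# any None value triggers the same failure.
batch = BatchEncoding({
    "input_ids": torch.tensor([[1, 2, 3]]),
    "labels": None,
})

# BatchEncoding.to() iterates over self.data and calls .to() on each
# value, so the None entry raises:
# AttributeError: 'NoneType' object has no attribute 'to'
batch.to("cpu")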