doubleHon opened this issue 7 months ago
The error is reported when running ./finetune_toy.sh.
Did you change anything of our script (including the data downloading part)?
I think I see where the problem is. Thanks for the reminder! Is there a specific difference between "finetune_toy.sh" and "finetune_toy_low_resource.sh"? I can now run "finetune_toy_low_resource.sh", but "finetune_toy.sh" always runs out of memory. We have two 40 GB A100s and four 24 GB 3090s.
The low_resource script splits the model and places the parts onto different GPUs (model parallelism); this helps if you have multiple small GPUs on the same machine, but it is slower than the other script. The toy script places an entire copy of the model on each GPU.
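Roughly speaking, the difference looks like this (a minimal sketch using the Hugging Face loading API; the model name is a placeholder and the actual scripts may wire this up differently):

```python
from transformers import AutoModelForCausalLM

MODEL_NAME = "some/base-model"  # placeholder, not the real checkpoint name

# low_resource style: model parallelism. device_map="auto" (needs the
# accelerate package) shards the layers across all visible GPUs, so each
# GPU only holds part of the model. Slower, but fits bigger models.
model_sharded = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

# toy style: one full copy of the model per GPU (data parallelism).
# Faster, but the whole model must fit in a single GPU's memory.
model_replica = AutoModelForCausalLM.from_pretrained(MODEL_NAME).to("cuda:0")
```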
We mentioned the training device requirements in the README file. I believe 40 GB and 24 GB GPUs need to use the low_resource script with our recipe, but it may be possible to run the original one with techniques like 8-bit training.
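For reference, 8-bit loading in transformers looks roughly like this (a sketch only; it assumes the bitsandbytes package is installed and that the backbone is loaded through from_pretrained, which may not match the actual script):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the weights to 8-bit on load, roughly halving GPU memory
# compared to fp16, at some cost in speed.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "some/base-model",              # placeholder checkpoint name
    quantization_config=bnb_config,
    device_map="auto",
)
```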
Thanks for your reply and good luck with your work!
File "/transformers/tokenization_utils_base.py", line 708, in as_tensor return torch.tensor(value) ValueError: too many dimensions 'str' ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (
audio_id
in this case) have excessive nesting (inputs typelist
where typeint
is expected).Isn’t the json file read and converted into the corresponding tensor? Why do I find that it has not been converted during debugging? The error is reported when running ./finetune_toy.sh. I don’t know how to solve it.
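In case it helps others hitting the same trace: this error usually means a non-numeric field reached the point where the batch is turned into tensors. A minimal reproduction under that assumption (the field values below are made up; only the audio_id name comes from the trace above):

```python
import torch

batch = {
    "input_ids": [[1, 2, 3], [4, 5, 6]],     # numeric features: fine
    "audio_id": ["clip_0001", "clip_0002"],  # made-up string ids
}

torch.tensor(batch["input_ids"])    # works: nested lists of ints
# torch.tensor(batch["audio_id"])   # fails: "too many dimensions 'str'"

# One common fix is to keep bookkeeping fields like audio_id out of the
# dict that gets tensorized, and only convert the numeric features:
tensors = {k: torch.tensor(v) for k, v in batch.items() if k != "audio_id"}
```

Whether that is the right fix here depends on how the finetune script builds its batches.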