YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Question about Finetune exp #27

Open doubleHon opened 7 months ago

doubleHon commented 7 months ago

File "/transformers/tokenization_utils_base.py", line 708, in as_tensor
    return torch.tensor(value)
ValueError: too many dimensions 'str'

ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (audio_id in this case) have excessive nesting (inputs type list where type int is expected).

Shouldn't the JSON file be read and converted into the corresponding tensors? During debugging I found that it is not being converted. The error is reported when running ./finetune_toy.sh, and I don't know how to solve it.
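(For context, this kind of ValueError usually means a non-text field, here the string audio_id, ended up in the batch that the tokenizer tries to turn into a tensor. The snippet below is a minimal, hypothetical illustration of the failure mode and of the padding=True / truncation=True call the error message suggests; it is not the repository's actual data-loading code, and the field names are made up.)

```python
# Minimal, hypothetical illustration (not the LTU code): only text fields can be
# tokenized and batched into tensors; a raw string field reproduces the error above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # any tokenizer with a pad token

batch = [
    {"instruction": "Describe the sound.", "audio_id": "/data/audio/clip_0001.wav"},
    {"instruction": "What is the speaker's emotion?", "audio_id": "/data/audio/clip_0002.wav"},
]

# Fine: tokenize the text fields with padding/truncation so every sequence in the
# batch has the same length and can be stacked into one tensor.
enc = tokenizer(
    [item["instruction"] for item in batch],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
print(enc["input_ids"].shape)

# Passing the raw string field (e.g. audio_id) to torch.tensor would reproduce the
# "too many dimensions 'str'" error reported in the traceback:
# import torch
# torch.tensor([item["audio_id"] for item in batch])  # ValueError
```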

YuanGongND commented 7 months ago

The error is reported when running ./finetune_toy.sh.

Did you change anything in our script (including the data downloading part)?

doubleHon commented 7 months ago

I think I see where the problem is. Thanks for the reminder! What exactly is the difference between "finetune_toy.sh" and "finetune_toy_low_resource.sh"? I can now run "finetune_toy_low_resource.sh", but "finetune_toy.sh" always runs out of memory. We have two 40 GB A100s and four 24 GB 3090s.

YuanGongND commented 7 months ago

The low_resource script splits the model and places the parts on different GPUs (model parallelism), which helps if you have multiple small GPUs on the same machine. It is slower than the other script. The toy script places an entire copy of the model on each GPU.

We mention the training device requirements in the readme file. I believe 40 GB and 24 GB GPUs need to use the low_resource script with our recipe, but it is possible to run the original one with techniques like 8-bit training.
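(For readers who want to try fitting the full recipe on smaller GPUs, the sketch below shows the two generic options mentioned above using the Hugging Face transformers API: spreading a model across GPUs with device_map="auto" (model parallelism) and loading the weights in 8-bit with bitsandbytes. This is only an illustration of the general techniques, assuming a LLaMA-style causal LM checkpoint; it is not the repository's finetune script, and the model path is a placeholder.)

```python
# Generic sketch, not the LTU finetune script. Shows (a) model parallelism via
# device_map="auto" and (b) 8-bit weight loading, two ways to fit a large model
# on smaller GPUs. "path/to/llama-checkpoint" is a placeholder path.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# (a) Model parallelism: accelerate shards the layers across all visible GPUs,
#     trading speed for the ability to combine several small cards.
model_mp = AutoModelForCausalLM.from_pretrained(
    "path/to/llama-checkpoint",
    torch_dtype=torch.float16,
    device_map="auto",
)

# (b) 8-bit weights (requires the bitsandbytes package): roughly halves the
#     memory needed for the weights compared to fp16, at some speed cost.
model_8bit = AutoModelForCausalLM.from_pretrained(
    "path/to/llama-checkpoint",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```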

doubleHon commented 7 months ago

Thanks for your reply and good luck with your work!