YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Running Issue about Low-Resource Training for LTU-AS #20

Open dingdongwang opened 5 months ago

dingdongwang commented 5 months ago

Hi, I have encountered an error when running stage1_proj_cla.sh. Both base_model and data_path are kept the same, and I also switched the script to finetune_low_resource.py with a smaller batch size (the other parameters are unchanged), but I still get a CUDA out-of-memory error. The GPUs I use are 4 × RTX 3090, which have the same VRAM as the A5000. May I kindly ask if you know the reason for this?

[screenshot: CUDA out-of-memory traceback]

Thank you and looking forward to your reply!

YuanGongND commented 5 months ago

Please first run this with our provided data (please follow our instructions in the toy finetuning section): https://github.com/YuanGongND/ltu/blob/main/src/ltu_as/train_scripts/finetune_toy_low_resource.sh

How much VRAM does a 3090 have?

dingdongwang commented 4 months ago

Thanks for your reply! The data I used is the provided toy data, and the VRAM of the 3090 is 24 GB.

YuanGongND commented 4 months ago

A 4 × 24 GB setup needs to use the low-resource code.

Note this change:

Original:

https://github.com/YuanGongND/ltu/blob/8c8f92446a8121fc78d2f7dece2a6e08dc2061b2/src/ltu/train_script/finetune_toy.sh#L18

Low resource:

https://github.com/YuanGongND/ltu/blob/8c8f92446a8121fc78d2f7dece2a6e08dc2061b2/src/ltu/train_script/finetune_toy_low_resource.sh#L21

In general, if you can run the low-resource toy script, then with the same change you can run the real low-resource training.
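The actual diff between the two linked script lines is not shown in this thread, but a common pattern behind such low-resource changes is shrinking the per-GPU micro batch size while raising gradient accumulation, so the effective (global) batch size the optimizer sees stays the same. A minimal sketch of that invariant, with made-up values:

```python
# Hypothetical illustration (values are not from the LTU scripts):
# a smaller micro batch fits in less VRAM, and more accumulation steps
# keep the effective batch size unchanged.

def effective_batch_size(micro_batch: int, accum_steps: int, n_gpus: int) -> int:
    """Global batch size per optimizer update."""
    return micro_batch * accum_steps * n_gpus

# "Original" setting: larger micro batch, fewer accumulation steps.
original = effective_batch_size(micro_batch=4, accum_steps=2, n_gpus=4)

# "Low-resource" setting: smaller micro batch, more accumulation steps.
low_resource = effective_batch_size(micro_batch=1, accum_steps=8, n_gpus=4)

print(original, low_resource)  # both 32: training dynamics stay comparable
```

This is why simply lowering the batch size alone (as in the original question) can still diverge from the intended recipe; the low-resource script pairs it with a matching accumulation change.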

dingdongwang commented 4 months ago

Bug fixed, thank you!

ErikIsMel commented 3 months ago

I also encountered this problem (with the finetune_toy.sh file). How can it be solved?

peggyxpxu commented 1 month ago

Hi, sir. I used one 32 GB V100 for the low-resource toy script, but I still get 'CUDA out of memory' even after changing the batch size to 1.

YuanGongND commented 1 month ago

Yes, that is as expected. You need either 1 × 48 GB GPU or 4 × 24 GB GPUs (we used 4 × 48 GB). A single 32 GB GPU needs some additional work to lower the computational cost, e.g., 8-bit training (warning: if you decide to use 8-bit training, it is much better to start from our pretrained audio model rather than from scratch).
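As rough background on why 8-bit training helps here: quantizing weights from 16-bit to 8-bit halves the memory the weights alone occupy. A back-of-envelope sketch, assuming a LLaMA-7B-class backbone (the 7B figure is an assumption, and activations, gradients, and optimizer state add substantially more during training):

```python
# Back-of-envelope VRAM estimate for model weights only.
# Real training needs much more (activations, gradients, optimizer state).

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory in GiB consumed by the weights at a given precision."""
    return n_params * bytes_per_param / 1024**3

n_params = 7e9                                # assumed 7B-parameter backbone
fp16_gb = weight_memory_gb(n_params, 2)       # 16-bit weights: ~13 GB
int8_gb = weight_memory_gb(n_params, 1)       # 8-bit weights:  ~6.5 GB

print(f"fp16 weights: {fp16_gb:.1f} GB, int8 weights: {int8_gb:.1f} GB")
```

This is why a single 32 GB card is borderline at 16-bit but becomes workable with 8-bit quantization plus the other low-resource settings.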

-Yuan

peggyxpxu commented 1 month ago

I used 2 × 32 GB GPUs and fixed the issue, thanks. I have another question: the V100 can't use BF16, so I use FP16 instead. Will this have any negative impact?