Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model (a low-resource Chinese LLaMA + LoRA recipe, with a structure modeled on Alpaca)
https://github.com/Facico/Chinese-Vicuna
Apache License 2.0

python finetune.py --data_path ./sample/merge_sample.json --test_size 9 fails with a training error #141

Closed jackywei1228 closed 1 year ago

jackywei1228 commented 1 year ago

Single-GPU training script.

1、python finetune.py --data_path ./sample/merge_sample.json --test_size 9

Training fails with: OutOfMemoryError: CUDA out of memory. Tried to allocate 44.00 MiB (GPU 0; 11.76 GiB total capacity; 9.73 GiB already allocated; 83.31 MiB free; 10.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
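
The message above itself suggests one mitigation: setting `max_split_size_mb` through `PYTORCH_CUDA_ALLOC_CONF` to reduce allocator fragmentation. A minimal sketch of doing that from Python (the value 128 is an arbitrary starting point, not something prescribed by this repo):

```python
import os

# The allocator config must be in the environment before torch
# initializes CUDA, i.e. before the first `import torch` in finetune.py.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # noqa: E402
```

Equivalently, export the variable in the shell before launching training. Note this only helps when reserved memory far exceeds allocated memory (fragmentation); it cannot free memory the model genuinely needs.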

My environment:

1. Ubuntu 20.04
2. RTX 3060 12G
3. Python 3.8
4. CUDA 12.0

Thanks for the help; I'm a complete beginner.

jackywei1228 commented 1 year ago

Or is a 3060 12G just not enough? Do I need to switch to a 2080 Ti 11G?

grantchenhuarong commented 1 year ago

Try reducing the MICRO_BATCH_SIZE and BATCH_SIZE parameters in finetune.py.
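
For context, a sketch of what that block typically looks like in an alpaca-lora-style finetune.py (the exact default values here are illustrative, not quoted from this repo):

```python
# Batch actually materialized on the GPU each step: this drives VRAM usage.
MICRO_BATch_SIZE = 4
# Effective batch size the optimizer sees.
BATCH_SIZE = 128
# Gradient accumulation bridges the two, so lowering MICRO_BATCH_SIZE
# cuts memory without shrinking the effective batch.
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
```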

jackywei1228 commented 1 year ago

OK, thanks. I'll give it a try.

jackywei1228 commented 1 year ago

Doesn't seem to work... I halved both values:

MICRO_BATCH_SIZE = 2 # this could actually be 5 but i like powers of 2
BATCH_SIZE = 64

Output: OutOfMemoryError: CUDA out of memory. Tried to allocate 44.00 MiB (GPU 0; 11.76 GiB total capacity; 9.77 GiB already allocated; 35.06 MiB free; 10.13 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Facico commented 1 year ago

@jackywei1228 It doesn't have much to do with BATCH_SIZE; lowering MICRO_BATCH_SIZE alone is enough, and it can go as low as 1. Also check whether your CUTOFF_LEN has been changed; our default is 256. A 2080 Ti uses just under 10G of VRAM.
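
Putting that advice together, a minimal-memory sketch (same assumed alpaca-lora-style names as in the earlier sketch; whether 12G ultimately suffices still depends on the model and quantization used):

```python
MICRO_BATCH_SIZE = 1   # the floor: one example resident on the GPU per step
BATCH_SIZE = 128       # leave as-is; accumulation preserves the effective batch
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE  # = 128
CUTOFF_LEN = 256       # repo default; activation memory grows with sequence length
```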