deepseek-ai / DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself
https://coder.deepseek.com/
MIT License

how to finetune deepseek-coder-33b-instruct with 8*A800 80G #78

Closed netrookiecn closed 6 months ago

netrookiecn commented 6 months ago

How can the full parameters be finetuned rather than using LoRA? When using DeepSpeed ZeRO stage 3, I run out of memory.

netrookiecn commented 6 months ago

> try cpu offload if u have 1T RAM

I tried ZeRO-3 and I have 1.2T of RAM, with batch_size set to 1, but it fails again. Do you have any suggestions?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.50 GiB (GPU 3; 79.35 GiB total capacity; 77.08 GiB already allocated; 415.19 MiB free; 77.54 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

@BeiQingLu1113
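The traceback itself suggests one knob that can be tried independently of the DeepSpeed config: setting `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF` to reduce allocator fragmentation. A minimal sketch (the value 128 is an example, not a tuned setting):

```python
import os

# The OOM message recommends setting max_split_size_mb via
# PYTORCH_CUDA_ALLOC_CONF to avoid fragmentation. It must be set before
# the first CUDA allocation, i.e. before torch initializes CUDA.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

This only helps when reserved memory far exceeds allocated memory; it does not shrink the model's actual footprint.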

netrookiecn commented 6 months ago


Solved by changing some params.

jaywongs commented 6 months ago

> solved by changing some params

Could you please paste your parameters here? I have encountered the same problem. Thank you!

netrookiecn commented 6 months ago

> Could you please paste your parameters here? I have encountered the same problem. Thank you!

Use 1T of RAM and DeepSpeed ZeRO-3 offload, with micro batch size = 1 and max length < 2048, then experiment from there.
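A minimal ZeRO-3 CPU-offload config along those lines can be sketched as below. Only stage 3, CPU offload, and micro batch size 1 come from the thread; every other value is an assumption, not the poster's actual setting:

```python
import json

# Sketch of a DeepSpeed config matching the advice above:
# ZeRO stage 3 with optimizer and parameter offload to CPU RAM,
# micro batch size 1 per GPU. Other values are assumptions.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 8,   # assumption, tune to taste
    "bf16": {"enabled": True},          # assumption; A800 supports bf16
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```

Pass the resulting file to the trainer via `--deepspeed ds_config.json`, and keep the tokenized sequence length under 2048 as suggested above.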