tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.
https://crfm.stanford.edu/2023/03/13/alpaca.html
Apache License 2.0

Finetune with A100 40G #280

Open · jianchaoji opened this issue 1 year ago

jianchaoji commented 1 year ago

Can we use A100 40G GPUs to finetune llama-7B? Has anyone tried that?

GasolSun36 commented 1 year ago

I tried 8 A100 40G to finetune llama-7B with FSDP offload, and it works fine for me.
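
For reference, a launch along these lines should work. This is a sketch adapted from the training command in this repo's README, with `offload` added to the `--fsdp` flag to enable FSDP CPU offloading; the checkpoint path, output directory, and port are placeholders, and the decoder-layer class name depends on your transformers version:

```bash
# Sketch: 8x A100 40G with FSDP full-shard + CPU offload, adapted from the
# README's training command. Paths and port below are placeholders.
torchrun --nproc_per_node=8 --master_port=<your_random_port> train.py \
    --model_name_or_path <path_to_hf_converted_llama_ckpt_and_tokenizer> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <your_output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap offload" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True
```

With 8 GPUs, a per-device batch size of 4, and 4 accumulation steps, the effective batch size stays at the README's 128 (8 x 4 x 4).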

jianchaoji commented 1 year ago

Thank you so much for the response! Did you try 4 A100 40G as well?

ffohturk commented 1 year ago

I tried 4 A100 40GB with FSDP offload, but had to reduce the eval and train batch size from 3 to 2 in order to avoid OOM. Took 58 hours.
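
One way to apply that batch-size reduction on 4 GPUs while keeping the effective batch size at 128 is sketched below. These are hypothetical settings, not ffohturk's confirmed configuration:

```bash
# Hypothetical 4x A100 40G variant of the sketch above: drop the per-device
# batch size to 2 and raise accumulation to 16 so the effective batch size
# stays at 4 GPUs * 2 per device * 16 steps = 128.
torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
    --model_name_or_path <path_to_hf_converted_llama_ckpt_and_tokenizer> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <your_output_dir> \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 16 \
    --fsdp "full_shard auto_wrap offload" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True
    # remaining hyperparameters (epochs, lr, scheduler, save/log steps) as in
    # the 8-GPU sketch above
```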

hychaochao commented 8 months ago

> I tried 4 A100 40GB with FSDP offload, but had to reduce the eval and train batch size from 3 to 2 in order to avoid OOM. Took 58 hours.

I tried the same configuration (4 A100 40G), but it still OOMs. Could you share your parameter settings? Thanks! @ffohturk