meta-llama / llama3

The official Meta Llama 3 GitHub site

Meta-Llama-3-70B-Instruct running out of memory on 8 A100-40GB #183

Open whatdhack opened 1 month ago

whatdhack commented 1 month ago

Describe the bug

Out of memory. Tried to allocate X.XX GiB .....

Minimal reproducible example

I guess any system with 8 A100-40GB GPUs will reproduce this:

python example_chat_completion.py

Output


Out of memory. Tried to allocate X.XX GiB  .....

Runtime Environment

Additional context

Is there a way to reduce the memory requirement? The most obvious trick, reducing the batch size, did not prevent the OOM.
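For rough sizing, a back-of-the-envelope calculator can show which knobs matter. The sketch below assumes bf16 weights evenly sharded across GPUs and a KV cache preallocated for `max_seq_len * max_batch_size` (as the reference llama3 code does); the layer/head numbers are illustrative defaults for a 70B-class model, not measured values:

```python
def estimate_memory_gib(n_params_b=70, bytes_per_param=2, n_gpus=8,
                        n_layers=80, n_kv_heads=8, head_dim=128,
                        max_seq_len=8192, max_batch_size=4):
    """Rough per-GPU memory estimate for tensor-parallel inference.

    Assumes weights are sharded evenly and the KV cache (K and V per
    layer, kv-heads sharded across GPUs) is fully preallocated.
    Activations and transient loading peaks are NOT included.
    """
    GiB = 1024 ** 3
    weights = n_params_b * 1e9 * bytes_per_param / n_gpus
    kv_cache = (2 * n_layers * (n_kv_heads / n_gpus) * head_dim
                * max_seq_len * max_batch_size * bytes_per_param)
    return weights / GiB, kv_cache / GiB

w, kv = estimate_memory_gib()
print(f"weights/GPU ~ {w:.1f} GiB, KV cache/GPU ~ {kv:.1f} GiB")
```

Under these assumptions the steady-state shard should fit in 40 GB, which suggests lowering `max_seq_len` and `max_batch_size` in `example_chat_completion.py` only helps at the margin; the OOM may instead come from transient peaks while loading the checkpoint, though that is not verified here.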

whatdhack commented 3 weeks ago

What is the best way to adapt the 70B model's 8 checkpoint shards (sized for A100-80GB/H100) to, say, 16 A100-40GB GPUs?
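One common approach to going from MP=8 to MP=16 is to split each shard in two along the dimension it is already sharded on. The sketch below uses NumPy stand-ins for clarity; the real checkpoints are PyTorch tensors, and the correct split axis differs per weight (column-parallel vs row-parallel layers in Megatron-style layouts, and some tensors are replicated rather than sharded), so this is an illustration of the idea, not a drop-in converter:

```python
import numpy as np

def reshard_2x(shards, axis):
    """Split each model-parallel shard in two along `axis`,
    turning MP=len(shards) into MP=2*len(shards).

    `axis` must be the dimension the weight is already sharded on;
    which axis that is per weight must be checked against the model code.
    """
    out = []
    for shard in shards:
        first, second = np.split(shard, 2, axis=axis)
        out.extend([first, second])
    return out

# Toy example: a (64, 32) weight sharded column-parallel across 8 ranks.
full = np.arange(64 * 32, dtype=np.float32).reshape(64, 32)
mp8 = np.split(full, 8, axis=0)   # 8 shards of shape (8, 32)
mp16 = reshard_2x(mp8, axis=0)    # 16 shards of shape (4, 32)

# Concatenating the 16 shards reconstructs the original weight.
assert np.array_equal(np.concatenate(mp16, axis=0), full)
```

In a real conversion you would load each `consolidated.XX.pth` shard on CPU, apply this split per weight with the correct axis, and save 16 new shard files.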

subramen commented 2 weeks ago

Please see this thread: https://github.com/meta-llama/llama3/issues/157#issuecomment-2110497041