Closed icoderzqliu closed 6 months ago
Hello, when I run the evaluation of the 70B model I use 8×80 GB GPUs, but it still OOMs. How can I evaluate the 70B model?
Hello, I have run into the same problem. May I ask how you solved it?
You can set the --max_memory_per_gpu flag to auto and it will shard the model across the available GPUs. You can also reduce the batch size to 1 and use bf16 precision (--precision bf16); if it still OOMs, you can try reducing max_length_generation.
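For reference, a minimal sketch of how those flags might be combined. The entry point (`accelerate launch main.py`), the model name, the task, and the `--model`/`--tasks`/`--batch_size` flag names are assumptions for illustration; only the memory-related flags come from the advice above.

```bash
# Sketch only: entry point, model, task, and batch-size flag names are assumptions;
# the memory-related flags are the ones suggested above.
accelerate launch main.py \
  --model meta-llama/Llama-2-70b-hf \
  --tasks humaneval \
  --precision bf16 \
  --max_memory_per_gpu auto \
  --batch_size 1 \
  --max_length_generation 512
```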