Hi, thanks for your great work! In your paper, you say you used 2 A100 GPUs and HF Accelerate to evaluate Llama2-70b. Did you use plain Accelerate, or Accelerate together with DeepSpeed? The repo doesn't seem to cover this, so I'm a little confused. Thanks for your patience!