ngruver / llmtime

https://arxiv.org/abs/2310.07820
MIT License
628 stars 139 forks

How to run Llama 70B? #14

Open choidami opened 8 months ago

choidami commented 8 months ago

Was there a specific command that was used to run the Llama 70B model? For example to do model-parallelism? What GPU configuration did the authors use?

ngruver commented 7 months ago

Hello,

This script can be used to run LLaMA-2 70B on a subset of the Monash archive datasets: https://github.com/ngruver/llmtime/blob/main/experiments/run_monash.py

You can also run it on your own dataset by adapting this line https://github.com/ngruver/llmtime/blob/main/experiments/run_monash.py#L84C21-L84C21 with your own train and test data.

Lately we have been using 2 A100 GPUs to run inference with the 70B model. It is also possible to use 6 GPUs with 32 GB of memory each. You might also be able to achieve reasonable performance with lower-precision weights, but that is not something we tested extensively.

Nate
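
For anyone sizing hardware against the GPU counts mentioned in this thread, here is a rough back-of-the-envelope sketch of the memory needed just to hold the weights of a 70B-parameter model at different precisions. It is not taken from the llmtime repository. The helper names `weight_memory_gb` and `min_gpus` are hypothetical, the 80 GB figure assumes the larger A100 variant, and the estimate ignores activations, the KV cache, and framework overhead, so real requirements will be higher.

```python
import math

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory (in GB, 1 GB = 1e9 bytes) needed to store the weights alone."""
    return n_params * bytes_per_param / 1e9

def min_gpus(n_params: float, bytes_per_param: float, gpu_mem_gb: float) -> int:
    """Lower bound on the GPU count: weights only, no activations or KV cache."""
    return math.ceil(weight_memory_gb(n_params, bytes_per_param) / gpu_mem_gb)

N_PARAMS = 70e9  # nominal parameter count of the 70B model

# Bytes per parameter for a few common precisions (illustrative choices).
precisions = [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]

for name, bytes_per_param in precisions:
    total = weight_memory_gb(N_PARAMS, bytes_per_param)
    on_80gb = min_gpus(N_PARAMS, bytes_per_param, 80)  # assumed 80 GB A100
    on_32gb = min_gpus(N_PARAMS, bytes_per_param, 32)  # 32 GB cards
    print(f"{name}: {total:.0f} GB of weights, "
          f">= {on_80gb} x 80 GB GPUs or >= {on_32gb} x 32 GB GPUs")
```

At fp16 the weights alone come to about 140 GB, which fits on two 80 GB cards. On 32 GB cards the weights-only lower bound is 5, so the 6 GPUs mentioned above leave some headroom for activations and the KV cache. The lower-precision rows only show how the weight footprint shrinks and say nothing about forecast quality.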