Open choidami opened 8 months ago
Hello,
This script can be used to run LLaMA-2 70B on a subset of the Monash archive datasets: https://github.com/ngruver/llmtime/blob/main/experiments/run_monash.py
You can also run it on your own dataset by adapting this line https://github.com/ngruver/llmtime/blob/main/experiments/run_monash.py#L84C21-L84C21 to use your own train and test data.
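A minimal sketch of that substitution (the name `my_dataset` and the CSV-style setup are hypothetical; llmtime generally represents a dataset as a (train, test) pair of pandas Series, but check the linked line in run_monash.py for the exact shape it expects):

```python
import pandas as pd

# Stand-in for your own data; in practice load it, e.g. with pd.read_csv.
values = pd.Series(range(100), dtype=float)

# Hold out the last 20% of the series as the test split.
split = int(len(values) * 0.8)
train, test = values[:split], values[split:]

# Replace the Monash datasets dict with your own name -> (train, test) entry.
datasets = {"my_dataset": (train, test)}
print(len(train), len(test))  # 80 20
```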
Lately we have been using 2 A100 GPUs to run inference with the 70B model. It is also possible to use 6 GPUs with 32 GB of memory each. You might also be able to achieve reasonable performance with lower-precision weights, though we have not tested that extensively.
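Some back-of-the-envelope arithmetic (mine, not the authors') on why those GPU counts work out for fp16 weights:

```python
import math

def min_gpus_for_weights(n_params: float, bytes_per_param: int,
                         gpu_mem_gb: float) -> int:
    """Minimum GPU count needed just to hold the weights
    (no headroom for activations or the KV cache)."""
    weight_gb = n_params * bytes_per_param / 1e9
    return math.ceil(weight_gb / gpu_mem_gb)

# fp16 weights: 70B params * 2 bytes/param ~= 140 GB
print(min_gpus_for_weights(70e9, 2, 80))  # 2 (matches the 2x A100-80GB setup)
print(min_gpus_for_weights(70e9, 2, 32))  # 5 (a 6th 32 GB card leaves headroom)
```

With the Hugging Face transformers library, passing `device_map="auto"` and `torch_dtype=torch.float16` to `AutoModelForCausalLM.from_pretrained` is the standard way to shard the fp16 weights across all visible GPUs, though I can't confirm that is the exact invocation the authors used.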
Nate
Was there a specific command used to run the Llama 70B model, for example to do model parallelism? What GPU configuration did the authors use?