Gadersd / llama2-burn

Llama2 LLM ported to Rust burn
MIT License
272 stars 17 forks

Enable Benchmarking of llama #8

Closed. nsosio closed this 3 months ago

nsosio commented 9 months ago

Overview

This pull request introduces a benchmarking script for evaluating the performance of Llama's generation process. The script, located at bin/benchmark/main.rs, measures the mean and standard deviation of tokens per second across multiple runs. Run the benchmark with the following command:

cargo run --bin sample <model_name> <tokenizer_filepath> <prompt> <n_tokens> <repetitions>
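
For readers who want a feel for the measurement itself, here is a minimal sketch of how tokens/second mean and standard deviation could be computed over repeated runs. It is not the code from bin/benchmark/main.rs: the function names `benchmark` and `mean_and_std` are hypothetical, the generation step is replaced by a sleep-based stand-in, and the choice of the sample standard deviation (n - 1 denominator) is an assumption.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Mean and sample standard deviation (n - 1 denominator) of `samples`.
/// Returns (0.0, 0.0) for an empty slice and a deviation of 0.0 for one sample.
fn mean_and_std(samples: &[f64]) -> (f64, f64) {
    if samples.is_empty() {
        return (0.0, 0.0);
    }
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    if samples.len() < 2 {
        return (mean, 0.0);
    }
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, variance.sqrt())
}

/// Calls `generate(n_tokens)` `repetitions` times and returns the
/// tokens/second rate observed in each run.
/// `generate` is a hypothetical stand-in for the model's generation call.
fn benchmark<F: FnMut(usize)>(mut generate: F, n_tokens: usize, repetitions: usize) -> Vec<f64> {
    (0..repetitions)
        .map(|_| {
            let start = Instant::now();
            generate(n_tokens);
            let elapsed = start.elapsed().as_secs_f64();
            n_tokens as f64 / elapsed
        })
        .collect()
}

fn main() {
    // Stand-in workload: pretend each token takes about one millisecond.
    let fake_generate = |n: usize| sleep(Duration::from_millis(n as u64));

    let rates = benchmark(fake_generate, 32, 5);
    let (mean, std) = mean_and_std(&rates);
    println!("{mean:.2} +/- {std:.2} tokens/s over {} runs", rates.len());
}
```

Timing each run separately, rather than timing all repetitions together, is what makes the standard deviation available; a single aggregate timing would only yield the mean.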

Changes Made