Closed: jeffeardley closed this 3 months ago
Thanks Drew, I agree with your comments and I'll fix all of them when I get a chance later today. I think this branch might be one commit behind my local repo, but yes, the version I've been testing does work.
April Fools! (@jeffeardley, good job)
This pull request adds the ability to measure how long a model takes to generate tokens. It can measure the latency of individual token generation as well as the latency of a sequence with a defined number of tokens. It modifies the template YAML file and run_eval.sh to add this functionality. The script can also test multiple sequence lengths in a single run; the lengths are set by editing the YAML file. If you have any suggested features or changes, please let me know.
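For anyone skimming the description, here is a minimal Python sketch of the kind of measurement described above. It is not the code in this PR: `generate_token`, `measure_token_latencies`, and `measure_sequence_lengths` are hypothetical names used only for illustration, and the model call is stubbed out with a sleep. The list of sequence lengths stands in for whatever the YAML file would supply.

```python
import time
from typing import Callable, Dict, List


def measure_token_latencies(generate_token: Callable[[], object],
                            num_tokens: int) -> List[float]:
    """Time each of num_tokens calls to generate_token (hypothetical stand-in
    for one decoding step) and return the per-call latencies in seconds."""
    latencies = []
    for _ in range(num_tokens):
        start = time.perf_counter()
        generate_token()
        latencies.append(time.perf_counter() - start)
    return latencies


def measure_sequence_lengths(generate_token: Callable[[], object],
                             sequence_lengths: List[int]) -> Dict[int, Dict[str, float]]:
    """For each requested sequence length, report the summed token latency
    and the mean latency per token."""
    results = {}
    for length in sequence_lengths:
        per_token = measure_token_latencies(generate_token, length)
        total = sum(per_token)
        results[length] = {
            "total_s": total,
            "mean_per_token_s": total / length if length else 0.0,
        }
    return results


if __name__ == "__main__":
    # Assumption: a fake model step that just sleeps for about 1 ms.
    def fake_generate_token():
        time.sleep(0.001)
        return "tok"

    # Assumption: these lengths would normally come from the YAML config.
    for length, stats in measure_sequence_lengths(fake_generate_token, [4, 8]).items():
        print(f"len={length} total={stats['total_s']:.4f}s "
              f"mean={stats['mean_per_token_s']:.4f}s/token")
```

The sketch uses `time.perf_counter()` because it is a monotonic, high-resolution clock intended for timing short intervals. The sequence total here is the sum of the per-token timings, so it excludes loop overhead; timing the whole loop with a single start and stop would be an equally reasonable choice.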