Closed — ccclyu closed this issue 10 months ago
Hello,
We used only one A100 80GB GPU for inference with YaRN-Mistral-7B. For most tasks, it takes around 10 minutes per example with our implementation eval_yarn_mistral.py, so roughly 8 hours for Retrieve.KV, for instance.
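As a back-of-the-envelope check of the figures above: at ~10 minutes per example, a total of ~8 hours implies roughly 48 examples per task. The example count below is an assumption inferred from those two reported numbers, not something stated in the thread:

```python
# Illustrative runtime estimate for one task.
# minutes_per_example is the reported per-example latency;
# examples_per_task is an ASSUMED count consistent with the ~8 h total.
minutes_per_example = 10
examples_per_task = 48

total_minutes = minutes_per_example * examples_per_task
total_hours = total_minutes / 60
print(f"~{total_hours:.0f} hours per task")  # → ~8 hours per task
```

Scaling `examples_per_task` to the actual task size gives a quick wall-clock estimate before launching a full run.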
Very useful benchmark! May I ask how long inference took on these tasks with YaRN-Mistral-7B? Did you use only one A100 80GB GPU?