[Open] yeahdongcn opened this issue 3 months ago
User story: I want to benchmark Ollama running inside a Docker container, and I would prefer to install llm_benchmark in a venv or conda env on a different host or in a separate container.
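As a minimal sketch of what that setup needs, assuming the container publishes Ollama's default port 11434 (e.g. `docker run -p 11434:11434 ollama/ollama`) and that the benchmark only needs a reachable base URL. `OLLAMA_HOST` below follows Ollama's own convention for overriding the server address; the rest is illustrative:

```python
import os

import requests

# Base URL of the Ollama server; point this at the Docker host when the
# benchmark runs in a separate venv/container.
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")


def server_version() -> str:
    """Quick connectivity check against the (possibly remote) Ollama server."""
    resp = requests.get(f"{OLLAMA_HOST}/api/version", timeout=5)
    resp.raise_for_status()
    return resp.json()["version"]


if __name__ == "__main__":
    print("Connected to Ollama", server_version())
```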
Ollama API doc: https://github.com/ollama/ollama/blob/main/docs/api.md#pull-a-model
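For reference, pulling a model through that endpoint from a remote client looks roughly like this (the model name is a placeholder; per the linked doc, with `"stream": false` the server replies with a single status object once the pull finishes):

```python
import requests

OLLAMA_HOST = "http://localhost:11434"  # replace with the container's address

# POST /api/pull downloads the model on the server side, so no model files
# need to exist on the benchmarking host. Older Ollama versions use the
# field "name" instead of "model".
resp = requests.post(
    f"{OLLAMA_HOST}/api/pull",
    json={"model": "llama3", "stream": False},
    timeout=None,  # pulls can take a while
)
resp.raise_for_status()
print(resp.json())  # {'status': 'success'} on completion
```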
I sent a PR that adds querying device information through the Ollama API: https://github.com/ollama/ollama/pull/5479. It could replace GPUtil for checking available VRAM, which matters here because GPUtil only sees GPUs on the machine running the benchmark, not the GPUs visible to a remote Ollama server.
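A sketch of how the benchmark side could consume that, heavily hedged: the `/api/devices` path and the `devices` / `free_memory` response fields below are assumptions based on the proposal in that PR, not a stable API, and would need to match whatever shape the PR lands with:

```python
import requests

OLLAMA_HOST = "http://localhost:11434"


def available_vram_bytes() -> int:
    """Query free VRAM from the Ollama server instead of local GPUtil.

    Endpoint path and response fields are assumptions based on
    ollama/ollama#5479; adjust once the PR's final shape is known.
    """
    resp = requests.get(f"{OLLAMA_HOST}/api/devices", timeout=5)  # hypothetical endpoint
    resp.raise_for_status()
    devices = resp.json().get("devices", [])  # hypothetical field
    return sum(d.get("free_memory", 0) for d in devices)  # bytes, hypothetical
```

The point of the design is that the VRAM figure then describes the GPUs the Ollama server actually runs on, regardless of where llm_benchmark itself is installed.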