The current ghcr.io/bentoml/openllm:latest image (sha256:1860863091163a8e8cb1225c99d6e1b0735c11871e14e8d8424a22a5ad6742fa) shows an error:

```
ValueError: The checkpoint you are trying to load has a model type of `cohere`, which Transformers does not recognize. This may be due to a problem with the checkpoint or an outdated version of Transformers.
```

when running:

```
docker run --rm --gpus all -p 3000:3000 -it ghcr.io/bentoml/openllm start CohereForAI/c4ai-command-r-v01 --backend vllm
```
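For reference, here is a minimal check that can be run inside the container to confirm the cause; the assumption is that the image ships a Transformers release older than 4.39.0, which is (to my knowledge) where Cohere support landed:

```python
# Quick diagnostic: verify whether the installed Transformers release
# recognizes the `cohere` model type. Cohere support landed in
# transformers 4.39.0; anything older raises the ValueError above.
import transformers
from transformers import AutoConfig

print(transformers.__version__)

try:
    # Fetches and parses only config.json, so no weights are downloaded.
    AutoConfig.from_pretrained("CohereForAI/c4ai-command-r-v01")
    print("`cohere` model type recognized")
except ValueError as err:
    print(f"not recognized: {err}")
```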
Also, installing openllm[vllm] pulls in vLLM 0.2.7, even though the vLLM version pinned on the main branch is 0.4.0: https://github.com/bentoml/OpenLLM/blob/main/openllm-core/pyproject.toml#L83 and https://github.com/bentoml/OpenLLM/blob/main/tools/dependencies.py#L157
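The mismatch is easy to see in a fresh environment created with `pip install "openllm[vllm]"`:

```python
# In a fresh environment after `pip install "openllm[vllm]"`:
import vllm

# Prints 0.2.7 from the published package, not the 0.4.0 pinned on main.
print(vllm.__version__)
```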
I think it should use the same prompting system; CohereForAI/c4ai-command-r-plus is also available, and it would be nice to be able to run it too.
Should be supported on main now. Will release a new version soon.
Feature request
Would be nice to have the ability to run Command-R (CohereForAI/c4ai-command-r-v01) using OpenLLM.
Motivation
No response
Other
The vLLM backend already supports Command-R as of v0.4.0: https://github.com/vllm-project/vllm/issues/3330#issuecomment-2041225404
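Until the new release lands, a possible stopgap is to call vLLM >= 0.4.0 directly. A minimal sketch (untested here; it assumes enough GPU memory for the 35B model and passes a raw prompt rather than Command-R's chat template):

```python
# Minimal sketch: run Command-R straight from vLLM >= 0.4.0, bypassing
# OpenLLM until the fixed release ships.
from vllm import LLM, SamplingParams

llm = LLM(model="CohereForAI/c4ai-command-r-v01")
params = SamplingParams(temperature=0.3, max_tokens=256)

# generate() takes raw prompt strings; Command-R's chat template is not
# applied here, so wrap prompts accordingly for chat-style use.
outputs = llm.generate(["Write a haiku about GPUs."], params)
print(outputs[0].outputs[0].text)
```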