-
/kind bug
**What steps did you take and what happened:**
Bump the vLLM version in huggingface_server to v0.5.1 so that KServe supports the new Gemma 2 models through vLLM.
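Concretely, the requested change amounts to a dependency bump along these lines (file path and pin format are illustrative, not the actual diff):

```
# huggingface_server dependency pin (illustrative)
vllm==0.5.1
```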
-
**Description:**
One of the reasons Ollama is so widely adopted as a tool to run local models is its ease of use and seamless integration with other tools. Users can simply install an app that star…
-
### 🚀 The feature, motivation and pitch
Currently, vLLM's Gemma 2 support does not include RoPE scaling, and I sincerely hope that support for it will be added in the future.
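For context, RoPE scaling is usually requested through a `rope_scaling` entry in the model's Hugging Face `config.json`, which vLLM would need to honor for Gemma 2. The keys and values below are illustrative, not the exact schema:

```json
{
  "rope_scaling": {
    "type": "linear",
    "factor": 2.0
  }
}
```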
-
### Your current environment
How do I initialize gemma2-27b with 4-bit quantization?
### How would you like to use vllm
Could you please explain how to initialize gemma2-27b with 4-bit quanti…
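A minimal sketch of one way to do this, assuming a vLLM build with bitsandbytes support; the argument names below are assumptions based on vLLM's quantization options, not a verified recipe:

```python
# Illustrative engine arguments for loading gemma2-27b with 4-bit
# bitsandbytes quantization in vLLM. Flag names are assumptions;
# check your vLLM version's supported options.
engine_args = {
    "model": "google/gemma-2-27b-it",  # illustrative model id
    "quantization": "bitsandbytes",    # 4-bit via bitsandbytes
    "load_format": "bitsandbytes",
    "dtype": "bfloat16",
}

# The actual load needs a GPU and the model weights, so it is left
# commented out here:
# from vllm import LLM
# llm = LLM(**engine_args)
```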
-
### Describe the bug
GraphRAG's parameter parsing completely misses the controllable LLM parameters, such as temperature, n, and top_p. Although these are in the settings.yaml file (described below) but…
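For reference, a hedged sketch of the relevant `llm` block in settings.yaml; the model name and values are illustrative, and the exact key set may differ between GraphRAG versions:

```yaml
llm:
  type: openai_chat
  model: gpt-4o      # illustrative
  temperature: 0.0   # sampling parameters the parser reportedly ignores
  top_p: 1.0
  n: 1
```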
-
### Your current environment
```
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC ve…
-
`model = LanguageModel("gpt2")`
The `help` for `LanguageModel` doesn't say which other models are available in the default namespace (for instance, can I use "llama3"? "gemma2"? etc.). It would be nice to see …
-
### Your current environment
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubunt…
-
### What is the issue?
When I try to run the command `ollama run gemma2`, this error shows up.
### OS
Windows
### GPU
_No response_
### CPU
Intel
### Ollama version
0.2.5
-
Consider the following models:
- LLaMa 3 8B
- LLaMa 2 7B
- OLMo 7B
- Gemma 7B
- Aya 23 8B
- Gemma2 9B
Changed to