-
When using model_worker with transformers to run the Gemma 2 9B model, it does not work correctly: the conversation template applied to the Gemma 2 model continues to generate a response until model_worker is kil…
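Runaway generation like this usually means the serving layer never cuts the output at Gemma 2's turn delimiter. A minimal sketch of the idea (the helper names are hypothetical, not part of model_worker): apply Gemma's turn template and truncate raw output at the `<end_of_turn>` marker.

```python
# Sketch: Gemma-style turn formatting and manual stop-marker truncation.
# build_gemma_prompt and truncate_at_stop are hypothetical helpers; they
# illustrate why generation runs away when the server never stops at
# <end_of_turn>.

END_OF_TURN = "<end_of_turn>"

def build_gemma_prompt(user_message):
    """Format a single-turn prompt in Gemma's chat template."""
    return (
        f"<start_of_turn>user\n{user_message}{END_OF_TURN}\n"
        f"<start_of_turn>model\n"
    )

def truncate_at_stop(generated, stop=END_OF_TURN):
    """Cut raw model output at the first stop marker, if present."""
    idx = generated.find(stop)
    return generated if idx == -1 else generated[:idx]

raw = "The capital of France is Paris.<end_of_turn>\n<start_of_turn>user\n"
print(truncate_at_stop(raw))  # The capital of France is Paris.
```

If the worker's conversation template omits `<end_of_turn>` from its stop strings, the model keeps hallucinating new turns indefinitely.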
-
I get the following error when I try to do a search (especially when in Expert mode):
```
500: {"error":"json: cannot unmarshal string into Go struct field ChatRequest.messages of type api.ToolCallF…
```
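The truncated error suggests the client is sending tool-call `arguments` as a JSON-encoded string where the server's Go struct expects an object. A hedged client-side sketch (the message shape follows the OpenAI-style convention; whether it matches this server's `ChatRequest` is an assumption from the error text) of decoding arguments before the request:

```python
import json

def normalize_tool_call(call):
    """If a tool call's arguments arrived as a JSON string (common with
    OpenAI-style clients), decode them into an object so a server whose
    Go struct expects a map can unmarshal the request. Hypothetical shim."""
    args = call.get("function", {}).get("arguments")
    if isinstance(args, str):
        call["function"]["arguments"] = json.loads(args)
    return call

call = {"function": {"name": "search", "arguments": '{"query": "gemma 2"}'}}
print(normalize_tool_call(call)["function"]["arguments"])  # {'query': 'gemma 2'}
```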
-
**❗BEFORE YOU BEGIN❗**
Are you on discord? 🤗 We'd love to have you asking questions on discord instead: https://discord.com/invite/a3K9c8GRGt
**Describe the bug**
I have followed the page at "htt…
-
The 3090 reports insufficient memory? Has anyone with experience tried this?
-
# Fix for gemma-2-9b - run with bfloat16
![image](https://github.com/ObrienlabsDev/machine-learning/assets/24765473/4e149bf2-e84e-48a8-b3bc-1939d1543f66)
https://huggingface.co/google/gemma…
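Gemma 2 checkpoints are stored in bfloat16, and loading them in float16 can overflow activations and produce garbage, while float32 doubles memory use. A sketch of the dtype choice (`pick_load_kwargs` is a hypothetical helper wrapping the usual `AutoModelForCausalLM.from_pretrained("google/gemma-2-9b", **kwargs)` call):

```python
# Sketch: choose a safe dtype for Gemma 2. bfloat16 needs hardware support
# (Ampere / RTX 30xx and newer); falling back to float16 risks overflow,
# so the conservative fallback here is float32.

def pick_load_kwargs(bf16_supported):
    """Return from_pretrained-style kwargs: bfloat16 where the GPU
    supports it, full float32 otherwise. Hypothetical helper."""
    return {
        "torch_dtype": "bfloat16" if bf16_supported else "float32",
        "device_map": "auto",
    }

print(pick_load_kwargs(True))  # {'torch_dtype': 'bfloat16', 'device_map': 'auto'}
```

On a 3090 (Ampere), bfloat16 is supported, which is consistent with the fix described above.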
-
Some models (e.g. Gemma 2) that can be loaded with TabbyAPI/exllama2 do not accept system prompts and throw an exception if a system prompt is included in the API call. E.g., Gemma 2's tokenizer.json) "ch…
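A common workaround, sketched below as a hypothetical client-side shim (not part of TabbyAPI), is to fold the system message into the first user message instead of letting the chat template raise:

```python
def fold_system_prompt(messages):
    """For models whose chat template rejects a 'system' role (e.g. Gemma 2),
    prepend the system text to the first user message instead of erroring."""
    if not messages or messages[0].get("role") != "system":
        return messages
    system, rest = messages[0], messages[1:]
    if rest and rest[0].get("role") == "user":
        merged = {"role": "user",
                  "content": f"{system['content']}\n\n{rest[0]['content']}"}
        return [merged] + rest[1:]
    # No user turn to merge into: downgrade the system turn to a user turn.
    return [{"role": "user", "content": system["content"]}] + rest

msgs = [{"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"}]
print(fold_system_prompt(msgs))
```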
-
### 🐛 Describe the bug
my code
```python
import os
from mem0 import Memory
# os.environ["OPENAI_API_KEY"] = "none" # for embedder
config = {
    "llm": {
        "provider": "ollama",
        …
```
-
### 🚀 The feature, motivation and pitch
Some models (e.g. Gemma 2) use hybrid attention: global attention plus local (sliding-window) attention. But vLLM currently ignores the local attention and uses global attention everywhere.
By simpl…
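The difference between the two attention patterns can be sketched with a toy mask (this is an illustration of the concept, not vLLM's implementation):

```python
def attention_mask(seq_len, window=None):
    """Causal attention mask: entry [i][j] is True where query i may attend
    to key j. window=None gives global (full causal) attention; an integer
    gives Gemma-2-style local attention over the last `window` positions."""
    return [[j <= i and (window is None or i - j < window)
             for j in range(seq_len)]
            for i in range(seq_len)]

# Global: position 3 attends to 0..3; local window=2: only to 2..3.
print(attention_mask(4, None)[3])  # [True, True, True, True]
print(attention_mask(4, 2)[3])     # [False, False, True, True]
```

Treating every layer as global is still correct output-wise only if the model was trained that way; for a hybrid model it changes which keys each query sees, which is why honoring the local layers matters.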
-
How do I use Llama 3.1 with Ollama? Do you support it?
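For reference, Ollama exposes a local REST API, so a minimal sketch of calling Llama 3.1 looks like this (assumes `ollama pull llama3.1` has been run and the server is listening on the default `localhost:11434`):

```python
import json

# Request body for Ollama's /api/chat endpoint.
payload = {
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}

# The actual call, commented out so the sketch runs without a server:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/chat",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.load(urllib.request.urlopen(req))["message"]["content"])

print(json.dumps(payload))
```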
-
Currently only 3 chat templates are present: https://github.com/TanvirOnGH/vscode-ollama-modelfile/blob/dev/snippets/modelfile.json#L37-L104.
## TODO Templates
- [x] ChatML (ccd461ac30c116110a7adda50…
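For context on what the ChatML entry covers, a sketch of the format it renders (illustrative only; the actual snippet text lives in modelfile.json):

```python
def render_chatml(messages):
    """Render messages in ChatML: each turn wrapped in
    <|im_start|>role ... <|im_end|>, then an open assistant turn
    for the model to complete."""
    out = ""
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    return out + "<|im_start|>assistant\n"

print(render_chatml([{"role": "user", "content": "Hi"}]))
```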