-
While testing different models, mainly Gemma 2, I have been receiving a lot of blank responses (no line, no spacing, no characters at all). Usually a few regenerations fix it, but sometimes it takes qu…
-
I am seeing the error `ModuleNotFoundError: No module named 'transformers.models.gemma2'` when trying to import unsloth after upgrading to version 2024.7. How can I install an older version of unsloth…
-
Since saving models to GGUF format is currently broken because of abysmal differences in performance, I am importing my fine-tuned model to Ollama from its LoRA adapters, but the inference results I'm…
-
### What is the issue?
The `mixtral:8x22b-instruct` model does not work correctly when only the system prompt is provided. In such cases, an empty prompt is sent, leading to irrelevant output.
T…
-
Hey, while running the 4-bit quantized model from https://huggingface.co/ThetaCursed/Ovis1.6-Gemma2-9B-bnb-4bit I am getting the following error:
```
{
"name": "RuntimeError",
"message": "self an…
```
-
### 🚀 The feature, motivation and pitch
Some models (Gemma2, ...) use hybrid attention: global attention plus local attention. However, vLLM currently ignores local attention and uses global attention.
By simpl…
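
For readers unfamiliar with the distinction, here is a minimal sketch of how a global causal mask differs from a local (sliding-window) one. It is an illustration only, not vLLM's or Gemma 2's actual code: the function names, the toy window size, and the rule that even-indexed layers are local are all assumptions made for this example.

```python
def causal_mask(seq_len):
    """Global attention: position i may attend to every position j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def sliding_window_mask(seq_len, window):
    """Local attention: position i may attend only to the most recent
    `window` positions (itself included), i.e. i - window < j <= i."""
    return [[i - window < j <= i for j in range(seq_len)] for i in range(seq_len)]


def layer_mask(layer_idx, seq_len, window):
    """Pick the mask for a layer in a hybrid stack.

    Assumption for this sketch: even-indexed layers use local attention
    and odd-indexed layers use global attention.
    """
    if layer_idx % 2 == 0:
        return sliding_window_mask(seq_len, window)
    return causal_mask(seq_len)


if __name__ == "__main__":
    # Print both masks for a toy sequence: 1 = may attend, 0 = masked out.
    for idx in (0, 1):
        print(f"layer {idx}")
        for row in layer_mask(idx, seq_len=5, window=2):
            print(" ".join("1" if allowed else "0" for allowed in row))
```

Treating every layer as global leaves the local layers attending to tokens outside their window once the sequence is longer than the window; for shorter sequences the two masks coincide.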
-
Hi,
I'm experiencing issues while running GEMMA2 on Ollama. Specifically, I'm encountering the following problems:
Error on Message Generation:
Whenever a new chat is created, every message…
-
I tried to run this with Gemma 2 27b it and found that it doesn't quite work. I verified that everything works with qwen/qwen-1_8b-chat.
I get this error message:
```
Assertion error: All scores…
```
-
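## On-device RAG vector database
Gemma2 2B has only just kicked off the on-device model craze, and on-device RAG is already here! 🔥🤯
sqlite-vec, the "fastest ⚡" on-device vector database, implemented in C as a SQLite extension, has been open-sourced and quickly gained 1.8K stars ⭐
- Processes 500,000 960-dimensional vectors in just 41 ms
- Supports JS/Rust, and supports both offline embedding with Llama.cpp and online embedding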
…
alwqx updated
1 month ago
-
When I train a LoRA on the lm_head layer with PEFT, the result is a LoRA adapter for the last layer only, but this adapter seems to be dropped at inference because it is given a wrong name at conversion (see #5 for code).
If…