-
### Describe the bug
```
Traceback (most recent call last):
  File "F:\gptai\text-generation-webui-snapshot-2024-04-28\modules\callbacks.py", line 61, in gentask
    ret = self.mfunc(callback=_callback…
```
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
```
- `llamafactory` version: 0.9.1.dev0
- Platform: Linux-5.19.0-0_fbk12_zion_11583_g0bef9520ca2…
```
-
### Your current environment
```
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Cen…
```
-
## Description
For some reason, Gemma2 is the only one of my models that doesn't get good VRAM estimates.
```
ollama show gemma2:27b-instruct-q6_K
Model
architecture gemma2
parameters …
```
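For a rough point of comparison, the weight footprint alone at q6_K, assuming llama.cpp's q6_K averages about 6.56 bits per weight (KV cache and runtime overhead come on top), works out to roughly 22 GB:
```
# Back-of-envelope weight memory for a 27B model at q6_K.
# Assumption: q6_K averages ~6.56 bits per weight in llama.cpp.
params = 27e9
bits_per_weight = 6.56
weight_gb = params * bits_per_weight / 8 / 1e9
print(f"~{weight_gb:.1f} GB of weights")  # ~22.1 GB, before KV cache/overhead
```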
-
Is it possible to implement this for other Hugging Face models besides LLaMA and Mistral?
-
It seems like the latest changes for supporting ShieldGemma (the Gemma 2 classification model) aren't working in 0.8.0. I have the dependency installed and copy-pasted from your example, but I still get:
```
C…
```
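As a fallback while the library path is broken, the scoring can be reproduced directly with transformers; a sketch following the pattern on the `google/shieldgemma-2b` model card, where the classifier reads the Yes/No probabilities at the final position (the policy prompt below is a placeholder, not the official template):
```
# Fallback sketch: score a message with ShieldGemma directly via transformers.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("google/shieldgemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/shieldgemma-2b")

# Placeholder policy prompt; substitute the template from the model card.
prompt = ("Does the user message violate the policy? Answer Yes or No.\n"
          "User message: <text>\nAnswer:")
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Compare the logits of the "Yes" and "No" tokens at the last position.
yes_id = tok.convert_tokens_to_ids("Yes")
no_id = tok.convert_tokens_to_ids("No")
probs = torch.softmax(logits[0, -1, [yes_id, no_id]], dim=0)
print(f"P(violation) ~ {probs[0].item():.3f}")
```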
-
Hi everyone,
To use Groq, just add the following to **models.py**:
```
url = 'https://api.groq.com/openai/v1/chat/completions'
groq = dict(type=GPTAPI,
model_type='gemma2-9b-it',
k…
```
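For completeness, a minimal sketch of what the full entry might look like, assuming this is a lagent-style `GPTAPI` config; the `key` and `openai_api_base` parameter names are assumptions modeled on the OpenAI-style entries such files typically contain:
```
# Hedged sketch of a complete Groq entry (assumed lagent GPTAPI config).
url = 'https://api.groq.com/openai/v1/chat/completions'
groq = dict(type=GPTAPI,
            model_type='gemma2-9b-it',
            key='YOUR_GROQ_API_KEY',   # assumption: your Groq API key goes here
            openai_api_base=url)       # assumption: points the client at Groq
```
This works because Groq's endpoint is OpenAI-compatible, so an OpenAI-style config entry can be reused with only the URL, key, and model name swapped.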
-
# Progress
- [x] Implement TPU executor that works on a single TPU chip (without tensor parallelism) #5292
- [x] Support single-host tensor parallel inference #5871
- [x] Support multi-host ten…
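For the items already landed, inference should look like vLLM's standard API; a minimal sketch, assuming a TPU-enabled vLLM build (device selection is automatic there), with an illustrative model name and `tensor_parallel_size` spanning the chips on one host:
```
# Sketch: single-host tensor-parallel inference with vLLM's standard API.
from vllm import LLM, SamplingParams

llm = LLM(model="google/gemma-2-9b-it", tensor_parallel_size=4)  # 4 chips, one host
out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```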
-
Currently, when I run gemma2 (via `ollama serve`) on my device, only 27 of its layers are offloaded to the GPU by default, but I want to offload all 43 layers.
Does anyone know how I can do that?
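One way that should work: Ollama exposes a `num_gpu` option that sets how many layers to offload, and it can be passed per request through the REST API. A sketch, using the layer count from the question:
```
# Ask Ollama to offload all layers by setting num_gpu in the request options.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma2",
        "prompt": "Hello",
        "stream": False,
        "options": {"num_gpu": 43},  # number of layers to offload to the GPU
    },
)
print(resp.json()["response"])
```
If the runner still loads only 27 layers after this, it has likely decided the full model doesn't fit in available VRAM and is capping the offload itself.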
-
**Describe the bug**
gemma2 2b is not available for selection in the models download menu
https://ollama.com/library/gemma2:2b