-
- [x] MiniCPM-Llama3-V-2_5
- [x] Florence 2
- [x] Phi-3-vision
- [x] Bunny
- [x] Dolphin-vision-72b
- [x] Llava Next
- [x] Qwen2-VL
- [x] Pixtral
- [x] Llama-3.2
- [x] Llava Interleave
- [x] …
-
It's effectively used more broadly than the VNG/GEMMA & Zaakgericht Werken context: it really is aimed at JSON-based, OpenAPI 3 driven services.
Proposal: `oas3-client` (boring but carries the wei…
-
**Qwen2**
warning: not compiled with GPU offload support, --n-gpu-layers option will be ignored
warning: see main README.md for information on enabling GPU BLAS support
Log start
main: build = 2…
-
Hi @danielhanchen
I am trying to fine-tune gemma2-2b for my task following the guidelines for continued finetuning in unsloth. However, I am facing OOM while doing so. My intent is to train gemm…
-
The cookbook aims to provide a comprehensive guide for researchers and practitioners interested in fine-tuning the Gemma model from Google on a mental health assistant dataset.
Key components of th…
-
**Describe the bug**
`git lfs pull --include gemma-2-9b-it-Q8_0_L.gguf`
vs
`git lfs pull gemma-2-9b-it-Q8_0_L.gguf` (typed accidentally)
does not make it very clear how many files, or how much data …
-
### 🚀 The feature, motivation and pitch
Gemma-2 and new Ministral models use alternating sliding window and full attention layers to reduce the size of the KV cache.
The KV cache is a huge inferen…
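The saving from alternating layer types can be sketched with a back-of-the-envelope calculation. This is a minimal illustration, not the engine's actual memory accounting; the layer count, head dimensions, and the even/odd alternation pattern below are assumptions chosen for the example.

```python
# Estimate KV-cache size when every other layer uses a sliding window
# instead of full attention (Gemma-2-style alternation).
# All parameter values here are illustrative assumptions.

def kv_cache_bytes(num_layers, seq_len, window, num_kv_heads, head_dim,
                   bytes_per_elem=2):
    """Sum K+V cache size over layers; odd layers cache only the window."""
    total = 0
    for layer in range(num_layers):
        # Assumed pattern: even layers full attention, odd layers sliding window.
        cached_tokens = seq_len if layer % 2 == 0 else min(seq_len, window)
        # Factor 2 = one K tensor plus one V tensor per cached token.
        total += 2 * cached_tokens * num_kv_heads * head_dim * bytes_per_elem
    return total

full = kv_cache_bytes(42, 8192, 8192, 8, 256)   # window covers the whole context
mixed = kv_cache_bytes(42, 8192, 4096, 8, 256)  # alternating 4k sliding window
print(f"full: {full / 2**20:.0f} MiB, mixed: {mixed / 2**20:.0f} MiB")
```

With these numbers the sliding-window layers cache half as many tokens, so the total drops to 75% of the all-full-attention figure; the real saving depends on the model's actual window size and layer pattern.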
-
David+Whitney
Alexis+Anacona
Tracey+Sazare
Atlas+Siluca
Damar+Aishela
Worapoj+Lalrinkimi
Hector+Harmonica
Takeshi+Kasumi
Wayne+Esmeralda
Arjen+Ketifa
Kanatbek+Lamia
Kazuhiro+Shinobu
Tugsts…
-
I tried some of the `web-ai-demos` on https://chrome.dev/, such as https://chrome.dev/web-ai-demos/perf-client-side-gemma-worker/
Some demos say that the model will take about 30s or 1 minute to lo…
-
### 🚀 Feature
Similar to EOS token, we should offer an option to add BOS token to the beginning. Might be useful for models like Gemma.
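One possible shape for such an option is a simple opt-in flag mirroring the EOS handling. The token ids and the `encode` helper below are illustrative assumptions, not the library's actual API:

```python
# Sketch of an opt-in add_bos flag, analogous to an add_eos flag.
# Token ids are assumed placeholders (Gemma uses a dedicated <bos> token).
BOS_ID = 2
EOS_ID = 1

def encode(token_ids, add_bos=False, add_eos=False):
    """Optionally prepend BOS / append EOS to an already-tokenized prompt,
    avoiding duplicates if the token is already present."""
    ids = list(token_ids)
    if add_bos and (not ids or ids[0] != BOS_ID):
        ids.insert(0, BOS_ID)
    if add_eos and (not ids or ids[-1] != EOS_ID):
        ids.append(EOS_ID)
    return ids

print(encode([10, 11], add_bos=True))  # → [2, 10, 11]
```

Guarding against an already-present BOS matters because models like Gemma are sensitive to a doubled `<bos>` at the start of the sequence.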