-
Minor Issue:
No matter what the user chooses in the model-selection dropdown, the first message always uses the default model, even when the user specifically chose a non-default one. Afte…
-
### System Info
TGI v2.2.0 with the official Docker image.
### Information
- [x] Docker
- [ ] The CLI directly
### Tasks
- [x] An officially supported command
- [ ] My own modifications
### Repr…
-
### What happened?
Hi! 😃
I'm getting a pricing error when trying to call Gemma 2 on Groq.
### Relevant log output
```shell
Error calculating completion cost: litellm.NotFoundError: Model not in model_pri…
```
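Not part of the original report, but a possible workaround sketch: LiteLLM lets you register custom pricing for a model missing from its cost map via `litellm.register_model`. The model id and per-token costs below are placeholder assumptions, not real prices.

```python
import litellm

# Hypothetical workaround: register pricing for the missing model so
# completion-cost calculation stops raising NotFoundError.
# The model id and costs here are illustrative placeholders.
litellm.register_model({
    "groq/gemma2-9b-it": {
        "max_tokens": 8192,
        "input_cost_per_token": 2e-07,   # placeholder value
        "output_cost_per_token": 2e-07,  # placeholder value
        "litellm_provider": "groq",
        "mode": "chat",
    }
})
```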
joch updated
2 months ago
-
## Introduction
When performing instruction tuning with transformers models, the input data must be formatted appropriately. In this study session, we explain how to carry out that data transformation efficiently.
## Chat Templates
When using an LLM (large language model) as a chat model, a template is needed to separate messages by role, such as user and assistant. こ…
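As a minimal sketch of what such a template does, assuming a tokenizer that ships a chat template (the model id below is only an example):

```python
from transformers import AutoTokenizer

# Any chat-tuned checkpoint with a built-in chat template works the same way;
# this model id is only an example.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

messages = [
    {"role": "user", "content": "What is instruction tuning?"},
    {"role": "assistant", "content": "Adapting a base model to follow instructions."},
]

# Render the role-tagged messages into the single prompt string the model expects.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```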
-
I am trying to fine-tune Llama 3.1 70B. It has [6 safetensors](https://huggingface.co/unsloth/Meta-Llama-3.1-70B-bnb-4bit/tree/main) totaling 39.52 GB.
I have a total of **57.6 GB of DISK*…
-
### Feature Request
Gemma2 support
👉👉👉[My Bilibili channel](https://space.bilibili.com/3493277319825652)
👉👉👉[My YouTube channel](https://www.youtube.com/@AIsuperdomain)
### Motivation
Gemma2 support
👉👉👉[My Bilibili chann…
win4r updated
3 months ago
-
### 🚀 The feature, motivation and pitch
Description:
Qwen2.5 (32B) is a state-of-the-art model, and it is especially interesting in 4-bit precision (bitsandbytes); see the sketch below.
- I tried integrating it, but the model d…
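For reference, a minimal sketch of loading a Qwen2.5 checkpoint in 4-bit with bitsandbytes via transformers; the checkpoint name is an assumption, and this is not the serving-stack integration the request asks for.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization, computing in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Checkpoint name is an assumption; substitute the exact model under test.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```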
-
When doing inference on Gemma-2-2B with Flash Attention 2, I get the following error; it works just fine with Flash Attention disabled (see the sketch below).
transformers==4.44.0
torch==2.4.0
flash-attn==2.6.3
python…
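A minimal sketch of the two configurations being compared, using the `attn_implementation` switch in transformers; the fallback shown is the one the report says works.

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "google/gemma-2-2b"

# Failing configuration per the report:
# model = AutoModelForCausalLM.from_pretrained(
#     model_id, torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2"
# )

# Working fallback: skip FlashAttention 2 and use the eager attention path.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
)
```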
-
Hi,
I am wondering whether this supports training for second-generation Gemma-based LLaVA.
When we trained the new Gemma with this repo, we ran into this error:
TypeError: '
-
Perplexica is permanently stuck on the "Answer" loader. As simple as that. Out of tens of attempts, it has worked only twice.
- Using a tiny `llama3.2:1b-instruct-q6_K` model.
- `ollama ps` sho…