-
![image](https://github.com/user-attachments/assets/e10e178f-e010-42c3-a306-57f95672735d)
-
### System Info
2024-11-26T11:36:19.229621Z INFO text_generation_launcher: Runtime environment:
Target: x86_64-unknown-linux-gnu
Cargo version: 1.80.1
Commit sha: d2ed52f531cf8098ca62375248e00702…
-
### Describe the bug
When I select Ollama, choose the Qwen2.5-coder:32b (32.8B) model, and click "Enhance prompt", I get "Invalid or missing provider". I am guessing that it's looking for an AP…
-
### OS
Linux
### GPU Library
CUDA 12.x
### Python version
3.10
### Pytorch version
xxxxxxxxxxx
### Model
turboderp/Mistral-7B-instruct-exl2
### Describe the bug
## Warning: Flash Attention…
-
Thanks for your great work! Could you additionally evaluate the Qwen2.5 and Qwen2 models? They also support 128K context length.
-
### System Info
[TensorRT-LLM] TensorRT-LLM version: 0.14.0
0.14.0
229it [00:59, 3.88it/s]
Traceback (most recent call last):
File "/app/tensorrt_llm/examples/qwen/convert_checkpoint.py", line 303,…
-
### System Info
A100
### Who can help?
_No response_
### Information
- [x] The official example scripts
- [ ] My own modified scripts
### Tasks
- [x] An officially supported task in the `exampl…
-
I have built a small example using the Python binding here: https://github.com/tarekziade/onnxruntime-test/blob/main/run.py
to measure the inference speed on my Apple M1 and on a Windows 11 box, using …
-
```
from trl import SFTTrainer
from transformers import TrainingArguments, DataCollatorForSeq2Seq
from unsloth import is_bfloat16_supported
trainer = SFTTrainer(
model = model,
tokeniz…
-
### What happened?
Running the following code results in an error. How can it be fixed?
code:
```
import litellm
from litellm import CustomLLM, completion, get_llm_provider
class Qwen(Cust…