-
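| Task | PR | Progress |
| --- | --- | --- |
| Extract the provider logic from the ai-proxy plugin so that other plugins can reuse it | | ⏳ Not started |
| Support immediately retrying the current request when it fails | | ⏳未…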
-
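# Model Parameter Support Zone
Hello everyone, the PaddleNLP team has compiled detailed information on the parameters of each model here for your convenience.
## Model Parameters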
### Base Models
| Model | 0.5B | 1~2B | 3~4B | 6~8B | 13~14B | 30~32B | 50~60B | 65~72B | 110B | >110B |
|:---------:|:--…
-
I used `Int8DynActInt4WeightQATQuantizer` to quantize the qwen2-1.5B model, but after calling the prepare function, I found that bias is set to False.
This is my code:
```
from torchtune.models.qwen2 import qwen2_1_…
-
I really like this model. Do you have plans to make another one based on the 14B or 32B Qwen 2.5 models?
-
This problem does not occur with the official qwen-max or qwen-long. Is there some restriction on open-source models?
-
Hello, thanks for your great work. I have a few small questions.
When testing a Qwen2-based model, like `llava_qwen` or `lmms-lab/LongVA-7B`, on the V-NIAH benchmark,
there is a function [apply_seq_…
-
https://qwenlm.github.io/blog/qwen2.5/
https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e
-
The highest-capability models that can run on the latest iPhone would be useful. The best I've found that fits in 8 GB of RAM is Qwen 2.5 7B 4-bit?
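
For a rough sense of why a 4-bit 7B model is about the ceiling for 8 GB, here is a back-of-the-envelope sketch of weight memory at different bit widths. It is only an estimate under stated assumptions: it uses a nominal 7e9 parameters (the real count of a "7B" model differs somewhat), counts weights only, and ignores the KV cache, activations, quantization scales, and whatever share of RAM the OS actually lets one app use. The helper name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / (1024 ** 3)


# Assumption: a nominal 7e9 parameters, not the exact count of any model.
NOMINAL_7B = 7e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gib(NOMINAL_7B, bits):.2f} GiB")
```

Under these assumptions the 16-bit weights alone exceed 8 GB, and the 4-bit variant leaves the most headroom for runtime overhead.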
-
Again, thank you for your work!
I think this project does not get the attention it deserves. This is by far the best OS model that can serve as a general agent in my use cases.
Would be curious to …
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
###…