-
**Describe the bug**
After quantizing the Tongyi Qianwen (Qwen) model, loading the quantized model and running inference raises an error:
```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer
pretrained_model_name = "qwen-v…
-
### 是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
- [X] 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
### 该问题是否在FAQ中有解答? | Is there an existing…
-
Hugging Face's Qwen2.5 or llama-3.2-11B doesn't seem to be supported.
Could we run a search first when a question comes in, then wrap the results in a role: "system" message and pass them along to the downstream API?
Or make an extra round trip: have one API call convert the question into search keywords, and have another API call restate the answer.
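The two-step flow proposed above can be sketched as follows. All function names here are illustrative stubs (not part of any real API): in practice `extract_keywords` and the final answer would each be a separate chat-completion call, and `run_search` would hit a real search backend.

```python
# Sketch of the proposed flow: one call turns the question into search
# keywords, a search runs, and the results are wrapped in a system
# message that is sent together with the user question.

def extract_keywords(question: str) -> str:
    # Stub: in practice, a chat-completion call asking the model to
    # compress the question into search keywords.
    return " ".join(w for w in question.split() if len(w) > 3)

def run_search(keywords: str) -> str:
    # Stub: in practice, a call to a search backend.
    return f"Top results for: {keywords}"

def build_messages(question: str) -> list[dict]:
    # Wrap the search results in a system message, as proposed above.
    context = run_search(extract_keywords(question))
    return [
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": question},
    ]
```

The second API call then receives `build_messages(question)` as its message list.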
-
### Your current environment
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubunt…
-
### Description
```
AI_APICallError: Failed to process successful response
at postToApi (/Users/liho/Desktop/epray/hydrogen-sep/epray/node_modules/@ai-sdk/openai/node_modules/@ai-sdk/provider-u…
-
I tested a local client calling a local deployment; environment:
openai 1.23.6
pydantic 2.8.2
Requests never reach the address defined in the client. My env file is set as follows:
OPENAI_BASEURL='http://192.168.9.56:3001/v1'
OPENAI_API_KEY='sk-EpLp5jA6zHXAfQ…
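One likely cause: the openai-python v1 client reads the environment variable `OPENAI_BASE_URL` (with an underscore before `URL`), not `OPENAI_BASEURL`, so the variable in the env file above is silently ignored unless the code reads it explicitly. A minimal sketch:

```python
import os

# openai-python v1 picks up OPENAI_BASE_URL automatically; a variable
# named OPENAI_BASEURL (no second underscore) is ignored by the library.
os.environ["OPENAI_BASE_URL"] = "http://192.168.9.56:3001/v1"

# Equivalent explicit form, with no env var needed:
#   client = OpenAI(base_url="http://192.168.9.56:3001/v1", api_key="sk-...")
```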
-
### When the type of content in the incoming messages is text, an error occurs.
**API**: `/v1/chat/completions`
### request
```json
{
"max_tokens": 0,
"model": "qwen-72b-chat-int4"…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
1. When deploying the streaming API, I get an error:
2. How do I obtain an openai.api_key like GPT-4's? If other models require openai.api_key, can I just set it to "none"?
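Regarding question 2: self-hosted OpenAI-compatible servers generally do not validate the key, so any non-empty placeholder string works. A sketch (the server address is illustrative):

```python
# For self-hosted OpenAI-compatible endpoints the key is usually not
# validated; a placeholder such as "none" or "EMPTY" is commonly used.
config = {
    "api_key": "none",                       # placeholder, not checked by most local servers
    "base_url": "http://localhost:8000/v1",  # illustrative address
}
```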
### E…
-
The deployed model is invoked as shown in the image:
![Uploading glm-6b调用.jpg…]()
-
### 起始日期 | Start Date
_No response_
### 实现PR | Implementation PR
_No response_
### 相关Issues | Reference Issues
_No response_
### 摘要 | Summary
Deploy a web API service via FastAPI, for convenient remote invocation and batch processing
### 基本示例 | Basic…