-
### Self Checks
- [X] I have [searched for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
- [X] I confirm that I am using English to fi…
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC …
-
I installed the demo by following the README at https://github.com/agiresearch/AIOS, but when I ran the real demo (`python main.py --llm_name ollama/qwen:7b`), an error message appeared: ModuleNotFoundErro…
-
### Class
Large language model
### Feature Request
Please consider adding support for Ollama: its ecosystem is quite healthy now and it supports many models. Thanks!
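For reference, a minimal sketch of what such an integration could look like, assuming only Ollama's standard local REST endpoint (`http://localhost:11434/api/generate`); the `build_payload` and `generate` helpers and the model name are hypothetical illustrations, not part of this project:

```python
import json
from urllib import request

# Ollama's default local endpoint for non-chat text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # Minimal non-streaming request body for Ollama's /api/generate API.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    # Send the prompt to a locally running Ollama server and return the reply text.
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream=False` the server returns one JSON object whose `response` field holds the full completion, which keeps the client side trivial; a production integration would likely use streaming instead.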
-
### Preliminary Confirmation
- [X] I confirm that I am running the latest version of the code with all required dependencies installed, and that I found no similar problem in the [FAQs](https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs).
### ⚠️ Searched for existing similar issues
- [X] I have searched the issues and discussions and found no related issu…
-
Qwen-7B-Chat with precision INT4_SYM and input/output tokens of 1024/128 can run on ARC with the numbers below.
![image](https://github.com/intel-analytics/ipex-llm/assets/97716131/b3b4def9-b004-4319-b6bd-04d2b0…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
After finishing training qwen2-72b-instruct and quantizing it to GPTQ 4-bit, I deployed it with the following command without any problems, and Q&A also works fine:
CUDA_VISIBLE_DEVICES=0,1 API_PORT=7864 llamafac…
-
server-1 | 2024-06-06 06:30:28 v1/message.go:88 | [INFO] waitResponse ...
server-1 | 2024-06-06 06:30:28 gin.handler/basic.go:101 | [ERRO] response error: runtime error: invalid memory address or…
-
### ⚠️ Searched for similar existing issues
- [X] I have searched the issues and discussions and found no similar issue.
### Summary
I suggest that the author team support connecting users' own large models.
### Example
_No response_
### Motivation
The project is great. I do LLM research myself, and I suggest supporting deploying one's own large model and connecting it to WeChat.
-
### Software Environment
```text
- paddlepaddle: develop
- paddlepaddle-gpu: develop 11.8
- paddlenlp: latest 4609d07a54ab97974b962b536dde7164ab15db93
```
### Duplicate Check
- [X] I have searched the existing issue…