-
### Describe the bug
> Draw a sine function
Plan:
1. Import necessary libraries in Python.
2. Generate x values.
3. Calculate corresponding y values using the sin function.
4. Plot…
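A minimal sketch of that plan, assuming NumPy and Matplotlib as the "necessary libraries" (the original does not name them):
```python
import numpy as np
import matplotlib.pyplot as plt

# 1. Generate x values over one full period.
x = np.linspace(0, 2 * np.pi, 200)

# 2. Calculate corresponding y values using the sin function.
y = np.sin(x)

# 3. Plot the curve.
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.title("Sine function")
plt.show()
```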
-
I ran both Qwen1.5-1.8B-Chat and Qwen-1_8B-Chat and hit a similar problem with each.
Taking Qwen1.5-1.8B-Chat as an example:
Using transformers==4.31.0:
```
Traceback (most recent call last):
  File "/data_sdb/demos/mnn-llm/models/llm-export/llm_export.py"…
```
-
### Installation Method
OneKeyInstall (one-click install script, Windows)
### Version
Latest
### OS
Windows
### Describe the bug
Traceback (most recent call last):
File ".\requ…
-
### 🐛 Describe the bug
When I try to train qwen2.5-7B-instruct using DeepSpeed, it shows the following error:
```
Traceback (most recent call last):
  File "/home/work/ybs/deeplm/LLM/train.py…
```
-
I want it to work on my existing project, with multiple code files, nested folders, and multimodality, using local models through Ollama and LiteLLM.
-
### System Info
- CPU Architecture: x86_64
- CPU/Host memory: 450 GiB
- GPU Name: NVIDIA A100
- GPU Memory Size: 80 GB
- TensorRT-LLM Branch: 0.7.1
- CUDA: 12.0
- Driver Version: 525.85.12
…
-
### System Info
```
text-generation-launcher 2.1.0
```
### Information
- [X] Docker
- [X] The CLI directly
### Tasks
- [ ] An officially supported command
- [ ] My own modifications
### Reprod…
-
By default, for every request made to an LLM (whether API-based or open-source), the entire completion is generated before any response is sent to the client. This creates a bad user experience.
Enter Stream…
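As a sketch of what streaming looks like from the client side, using the OpenAI Python SDK purely as an illustrative example (the model name and prompt are placeholders, not part of the original request):
```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# With stream=True the server sends partial chunks as tokens are generated,
# instead of waiting for the full completion.
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Draw a sine function"}],
    stream=True,
)

# Forward each chunk to the user as soon as it arrives.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```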
-
Solved by #19.
Currently it takes a lot of time to start the model; implement caching to improve loading speed.
PS: please create the PR against the local-llm branch, not the main branch.
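One way such caching could look, as a minimal sketch assuming a transformers-style loader (`get_model` and the model ID are illustrative, not this project's actual code):
```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_model(model_id: str):
    """Load the model once; later calls return the cached instance."""
    # Hypothetical loader -- swap in the project's actual loading code.
    from transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(model_id)

model = get_model("Qwen/Qwen1.5-1.8B-Chat")  # slow: loads from disk
model = get_model("Qwen/Qwen1.5-1.8B-Chat")  # fast: cache hit
```
Note that an in-process cache like this only helps when the model is loaded more than once per process; speeding up cold starts would instead mean caching converted weights on disk.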
-
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)
### Where…