[Help] With an API deployment, how do I interrupt the model and stop generation?

THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

Other

15.73k stars 1.85k forks

[Help] With an API deployment, how do I interrupt the model and stop generation? #622

Open mymynew opened 1 year ago

mymynew commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

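With the API deployment, a single generation sometimes takes a very long time. In that case I want to interrupt the model's generation. Which interface should I call to stop the execution?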

Expected Behavior

No response

Steps To Reproduce

...

Environment

- OS: Ubuntu
- Python: 3.10
- Transformers:
- PyTorch: 2.0.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): true

Anything else?

No response