josStorer / RWKV-Runner

A RWKV management and startup tool, full automation, only 8MB. And provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for commercial use.
https://www.rwkv.com
MIT License
5.31k stars 502 forks source link

Linux服务器的webGPU并行问题 #349

Closed shiroko98 closed 5 months ago

shiroko98 commented 5 months ago

我在Linux服务器上部署了此项目,看到使用webgpu可以支持并行响应,在命令行参数使用 python ./backend-python/main.py --port 8001 --host 0.0.0.0 --webgpu,并调用switch_model加载st模型后,还是阻塞式响应,请问还需要修改什么地方吗? switch-model参数: { "model": "...", "strategy": "CUDA fp16", "tokenizer": "", "customCuda": false, "deploy": false }

shiroko98 commented 5 months ago

之前已根据报错安装了web-rwkv-py,之后便可以正常chat,但是是阻塞式响应。但应该还未安装Vulkan,是否有影响?