Linux服务器的webGPU并行问题

josStorer / RWKV-Runner

A RWKV management and startup tool, full automation, only 8MB. And provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for commercial use.

https://www.rwkv.com

MIT License

5.07k stars 484 forks source link

Linux服务器的webGPU并行问题 #349

Closed shiroko98 closed 3 months ago

shiroko98 commented 3 months ago

我在Linux服务器上部署了此项目，看到使用webgpu可以支持并行响应，在命令行参数使用 python ./backend-python/main.py --port 8001 --host 0.0.0.0 --webgpu，并调用switch_model加载st模型后，还是阻塞式响应，请问还需要修改什么地方吗？ switch-model参数： { "model": "...", "strategy": "CUDA fp16", "tokenizer": "", "customCuda": false, "deploy": false }

shiroko98 commented 3 months ago

之前已根据报错安装了web-rwkv-py，之后便可以正常chat，但是是阻塞式响应。但应该还未安装Vulkan，是否有影响？