DmitriiMS closed this issue 8 months ago
Switching between models is not going to be fast anyway. I would rather emulate two cards on a single NVIDIA card and run two Docker containers. Something like
https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html
It takes about 3-5 seconds to switch models, which was acceptable in my case. I figured out that I can run 3 servers on different ports inside a single Docker container, and it works fine. Installing vGPU drivers would be harder, in my opinion. I'm closing the issue.
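For anyone who wants to try the several-servers-in-one-container approach, here is a minimal sketch of a launcher. It is not the script used in this thread. The model paths, the base port, and the environment variable names `VOSK_SERVER_PORT` and `VOSK_MODEL_PATH` are assumptions; check how your copy of `asr_server_gpu.py` actually reads its port and model location before relying on them.

```python
import os
import subprocess
import sys

# Hypothetical mapping of language code -> model directory inside the container.
MODELS = {
    "en": "/opt/vosk-server/model-en",
    "ru": "/opt/vosk-server/vosk-model-ru-0.42",
}


def build_launch_specs(models, base_port=2700, script="asr_server_gpu.py"):
    """Return one (language, port, command, env) tuple per model.

    Ports are assigned consecutively from base_port, in sorted language order.
    """
    specs = []
    for offset, (lang, model_path) in enumerate(sorted(models.items())):
        port = base_port + offset
        env = dict(os.environ)
        # Assumed variable names; adjust to whatever your server script reads.
        env["VOSK_SERVER_PORT"] = str(port)
        env["VOSK_MODEL_PATH"] = model_path
        specs.append((lang, port, [sys.executable, script], env))
    return specs


def launch_all(specs):
    """Start every server process and return the Popen handles keyed by language."""
    return {lang: subprocess.Popen(cmd, env=env) for lang, _port, cmd, env in specs}
```

Calling `launch_all(build_launch_specs(MODELS))` from the container entrypoint would start one process per model, and the client then picks the port that matches the language it needs. All processes still share the same GPU memory, so whether this fits depends on the card and the models.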
Hello. I use the GPU version of Vosk server and I would like to switch between models on the fly, mainly EN (the one that comes with the Docker container) and RU (vosk-model-ru-0.42). I added a volume with the model to the Docker container, and it runs great with either model. I also modified asr_server_gpu.py so that it takes a parameter and switches the model based on it:
Switching models works fine a couple of times, but it's almost guaranteed that after switching from the EN model to the RU model I receive this error:
I understand that this is a Kaldi error, but maybe I am doing something wrong with Vosk server?
Using several Docker containers is pretty hard, because they compete for the only GPU I have (it doesn't support MIG, and other options are pretty hard to implement). Loading multiple BatchModels leads to a segmentation fault (which is expected). The non-GPU server works fine in this use case, but it's pretty slow.
Is there anything I can do to make this dynamic switching work? Should I take this issue to the Kaldi repo? Or is it better to implement a workaround, e.g. killing the server inside the container and relaunching it with the new model?
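To make the relaunch workaround concrete, here is a rough sketch of a small supervisor that keeps at most one server process alive and restarts it when a different model is requested. The `ServerSwitcher` class and the `vosk_command` helper are hypothetical names. The assumption that the modified `asr_server_gpu.py` accepts the model directory as its first argument is mine, not something the server is known to support.

```python
import subprocess
import sys


class ServerSwitcher:
    """Keep at most one server process alive; restart it to change models."""

    def __init__(self, make_command, stop_timeout=10.0):
        # make_command: callable mapping a model name to an argv list.
        self._make_command = make_command
        self._stop_timeout = stop_timeout
        self.current_model = None
        self.process = None

    def stop(self):
        """Terminate the running server, escalating to kill on timeout."""
        if self.process is None:
            return
        if self.process.poll() is None:
            self.process.terminate()
            try:
                self.process.wait(timeout=self._stop_timeout)
            except subprocess.TimeoutExpired:
                self.process.kill()
                self.process.wait()
        self.process = None
        self.current_model = None

    def switch(self, model_name):
        """Ensure a server for model_name is running and return its process."""
        still_running = self.process is not None and self.process.poll() is None
        if still_running and model_name == self.current_model:
            return self.process
        self.stop()
        self.process = subprocess.Popen(self._make_command(model_name))
        self.current_model = model_name
        return self.process


def vosk_command(model_name):
    # Hypothetical: assumes the modified server script takes the model
    # directory as its first command-line argument.
    return [sys.executable, "asr_server_gpu.py", model_name]
```

With `switcher = ServerSwitcher(vosk_command)`, calling `switcher.switch("vosk-model-ru-0.42")` stops the old process before starting the new one, so only one BatchModel is ever loaded on the GPU at a time. The cost is a full model reload on every switch, and clients have to reconnect once the new server is listening.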