fedirz / faster-whisper-server

https://hub.docker.com/r/fedirz/faster-whisper-server
MIT License

Security / Bug / Clarification: a user request can force the server to download a payload / overload disk space, no? #106

Open thiswillbeyourgithub opened 2 weeks ago

thiswillbeyourgithub commented 2 weeks ago

Hi,

I'm confused about how the server decides which model to use.

As the owner of the server, I control the config and environment variables. If I set WHISPER__MODEL=tiny because I have a very small server, I expect that to be the model my server uses. But what seems to happen is that if a user sends a request with model=large-v3, my server starts downloading and loading the new model!

I think this can be useful in some situations, but it should be opt-in (i.e. disabled by default).

Here are my most compelling reasons:

  1. I don't want users to be able to try 20 models from Hugging Face and fill up my disk.
  2. I don't want 3 users to be able to load 3 different models. Concurrent requests would be an issue.
  3. Some apps call whisper themselves and don't allow changing the model, or have a hardcoded "whisper-1", even though the owner specified a model.
  4. I could be mistaken, but it could be a security risk: simply using "model=hackerhfaccount/corruptedmodel" in a request would make the server download the payload. Right? It could maybe even crash my openwebui instance if I set it to depend on faster-whisper-server being healthy.

My suggested solution is to add an environment variable HONOR_REQUEST_MODEL that defaults to False but if True would do what's currently implemented.
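To make the proposal concrete, here is a minimal sketch of how such a flag could behave. It is not the project's code: `resolve_model`, the `env` parameter, and the set of accepted truthy strings are hypothetical names chosen for illustration, and HONOR_REQUEST_MODEL is only the variable proposed in this issue.

```python
import os
from typing import Mapping, Optional

# Strings treated as "true" for the flag (an assumption, not a project convention).
TRUTHY = {"1", "true", "yes", "on"}


def resolve_model(requested: Optional[str],
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the model to use for one request (hypothetical helper)."""
    env = os.environ if env is None else env

    # The model chosen by the server owner. This sketch raises KeyError
    # if it is unset; a real server would have its own default.
    configured = env["WHISPER__MODEL"]

    # Proposed flag: off unless the owner explicitly opts in.
    honor = env.get("HONOR_REQUEST_MODEL", "false").strip().lower() in TRUTHY

    if honor and requested:
        return requested  # opt-in: behave as the server does today
    return configured     # default: ignore the model named in the request
```

With the flag unset, a request for "large-v3" or a hardcoded "whisper-1" would still resolve to the owner's configured model, which covers points 1 to 3 above.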

Other solutions could be:

What do you think? Also, the env variable should instead be "WHISPER__DEFAULT_MODEL", don't you think?

fedirz commented 2 weeks ago

I agree with the things you've pointed out. Let me think of the best way to go about implementing this. Thanks for creating the issue!

Edit: I'll likely add a mode where only the downloaded models can be used.
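One way a "downloaded models only" mode could work is sketched below. This is an illustration under stated assumptions, not the maintainer's implementation: `downloaded_models` and `check_model_allowed` are hypothetical helpers, and the sketch assumes models live in a Hugging Face hub style cache where each repository is a directory named `models--<org>--<name>`. Short aliases such as "tiny" would need to be mapped to full repository ids first, which is not shown here.

```python
from pathlib import Path

PREFIX = "models--"


def downloaded_models(cache_dir: Path) -> set[str]:
    """Return repo ids (e.g. "org/name") that already exist in cache_dir."""
    found = set()
    for entry in cache_dir.iterdir():
        if entry.is_dir() and entry.name.startswith(PREFIX):
            # "models--org--name" -> "org/name" (split on the first "--" only)
            found.add(entry.name[len(PREFIX):].replace("--", "/", 1))
    return found


def check_model_allowed(requested: str, cache_dir: Path) -> str:
    """Reject any model that would trigger a new download."""
    if requested not in downloaded_models(cache_dir):
        raise PermissionError(
            f"model {requested!r} is not downloaded on this server"
        )
    return requested
```

A request handler could call `check_model_allowed` before loading anything, so an unknown id such as "hackerhfaccount/corruptedmodel" is refused without touching the network or the disk.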