Sharrnah / whispering

Whispering Tiger - OpenAI's Whisper (and other models) with OSC and WebSocket support, allowing live transcription/translation in VRChat and overlays in most streaming applications.

Implement error handling/retrying when downloading models #5

Closed: Infinitay closed this issue 2 years ago

Infinitay commented 2 years ago
```
###################################
# Whispering Tiger is starting... #
###################################
Websocket: Server started.
Initializing medium NLLB-200 model.
Downloading medium NLLB-200 model...
Websocket: Client connected.
  1% (47431680 of 3070177447) |                                                  | Elapsed Time: 0:00:06 ETA:   0:05:43Websocket: Client connected.
Websocket: Client connected.
 17% (528465920 of 3070177447) |#######                                  | Elapsed Time: 0:03:16 ETA:  14 days, 2:27:20<urlopen error retrieval incomplete: got only 528465069 out of 3070177447 bytes>
Whisper AI Ready. You can now say something!
```

This has happened 3-4 times now. I imagine Whispering isn't actually ready for use, since the model failed to download. I'm currently downloading it into .cache/nllb200/ myself; I probably should have done that the first time the download failed. As of writing this issue, the download via my browser froze for around a minute and then resumed, multiple times. Is there a limit on the storage provider?

Does the zip file just contain the model's checkpoint? If so, can I use the one provided directly by Meta? In that case, I'm not sure why the models need to be hosted on your server at all, aside from the compression, which gets undone when the archive is extracted anyway.

Sharrnah commented 2 years ago

Sorry to hear that you're having issues downloading the model.

Unfortunately, since it uses the Transformers implementation from Hugging Face, it will not work with just the checkpoints provided by Meta.

It should also work if you download the files from https://huggingface.co/models?search=facebook/nllb and put them in the correct location.
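
For anyone taking that route, a minimal sketch using the huggingface_hub client (the repo id, cache directory, and token handling are assumptions about your setup, not the app's actual loading code):

```python
# Sketch: fetch an NLLB-200 checkpoint from the Hugging Face Hub.
import os

from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="facebook/nllb-200-distilled-600M",  # example NLLB-200 repo
    cache_dir=os.path.join(".cache", "nllb200"),  # example target location
    token=os.environ.get("HF_TOKEN"),  # access token, if one is required
)
print("Model files are in:", local_path)
```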

Since Hugging Face only allows downloads if you have an account and have set up an access token, I thought it would be too difficult for most people to get it running, so I decided to host the files myself. There should be no limit except a default speed limit of about 10 MByte/s (= 80 Mbit/s).

I will see if I can add a retry mechanism (or, even better, make it continue where it stopped) or an automatic switch to the US server. (I haven't uploaded the files to the US server yet because there is no automatic switching.)
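
A minimal sketch of what that could look like, assuming the files are served over plain HTTPS and the server honors Range requests for resuming (the URL, target path, and retry counts here are placeholders, not the app's actual code):

```python
# Sketch: resumable download with retries and exponential backoff.
# Assumes the server answers Range requests with 206 Partial Content.
import os
import time
import urllib.error
import urllib.request

def download_with_resume(url: str, target: str, retries: int = 5) -> None:
    os.makedirs(os.path.dirname(target), exist_ok=True)
    for attempt in range(1, retries + 1):
        # Resume from whatever the previous attempt already wrote.
        offset = os.path.getsize(target) if os.path.exists(target) else 0
        request = urllib.request.Request(url)
        if offset:
            request.add_header("Range", f"bytes={offset}-")
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                if offset and response.status != 206:
                    offset = 0  # server ignored the Range header; restart
                with open(target, "ab" if offset else "wb") as out:
                    while True:
                        chunk = response.read(1024 * 1024)
                        if not chunk:
                            break
                        out.write(chunk)
            return  # download completed
        except (urllib.error.URLError, ConnectionError, TimeoutError) as err:
            print(f"Attempt {attempt}/{retries} failed: {err}")
            time.sleep(min(2 ** attempt, 60))  # back off before retrying
    raise RuntimeError(f"Download failed after {retries} attempts: {url}")

# Example usage (placeholder URL):
# download_with_resume("https://example.com/NLLB-200/medium.zip",
#                      os.path.join(".cache", "nllb200", "medium.zip"))
```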

Or, if it stays this bad, maybe I will host the files somewhere else.

Can I ask whether you are located closer to the US or to the EU? And are you connecting over WLAN or via a LAN cable?

Infinitay commented 2 years ago

US, connected directly via Ethernet.

I'll keep in mind for the future that, if there are new models, I should use the ones from Hugging Face. Thank you.

Sharrnah commented 2 years ago

@Infinitay can you test if downloading from the US server is more stable?

NLLB-200 medium size: https://usc1.contabostorage.com/8fcf133c506f4e688c7ab9ad537b5c18:ai-models/NLLB-200%2Fmedium.zip

By the way, the NLLB-200 model is not required to run the application. It's just the text translation model used for further translation of the Whisper results.

Infinitay commented 2 years ago

> can you test if downloading from the US server is more stable?

Worked with no apparent issue.

> By the way, the NLLB-200 model is not required to run the application. It's just the text translation model used for further translation of the Whisper results.

I understand, but Argos is horrible for Korean. I was resorting to transcribing and then manually copy-pasting the transcriptions into Papago's translator. NLLB-200 seems to be doing very well in the few tests I've done.

Sharrnah commented 2 years ago

Thank you for your help. The next version will have retries, a fallback to the US server, and a checksum check after download.
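
For reference, a rough sketch of how that flow could fit together, reusing the download_with_resume sketch from above (the mirror URLs and the digest are placeholders, not the project's real values):

```python
# Sketch: try each mirror in order and verify the download's SHA-256.
import hashlib
import os

MIRRORS = [
    "https://eu.example.com/NLLB-200/medium.zip",  # placeholder EU mirror
    "https://us.example.com/NLLB-200/medium.zip",  # placeholder US mirror
]
EXPECTED_SHA256 = "0" * 64  # placeholder digest of the zip file

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def fetch_verified(target: str) -> None:
    for url in MIRRORS:
        try:
            download_with_resume(url, target)  # sketch from earlier comment
        except RuntimeError:
            continue  # this mirror failed; fall back to the next one
        if sha256_of(target) == EXPECTED_SHA256:
            return  # file is complete and intact
        os.remove(target)  # checksum mismatch: discard and try next mirror
    raise RuntimeError("All mirrors failed or returned a corrupt file")
```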

So I hope it is okay that I close this.

Feel free to open a new one if you find anything else.