m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

Unable to run whisperx offline #299

Open ll8083129 opened 1 year ago

ll8083129 commented 1 year ago

Hello, I'm trying to set up an offline environment with multiple whisperx instances on virtual machines.

First I ran everything online, so the large-v2 model was downloaded to the cache. After disconnecting from the Internet, whisperx still tries to connect to huggingface.co even though I pass the --model_dir {model location} parameter, and transcription fails with this error:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/guillaumekln/faster-whisper-large-v2/revision/main (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000001F6903D5FF0>: Failed to resolve 'huggingface.co' ([Errno 11001] getaddrinfo failed)"))
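
The URL in the traceback (/api/models/.../revision/main) suggests the failure comes from the Hugging Face Hub lookup that happens before the cached files are used. A minimal sketch of one possible workaround follows, under two assumptions that are not confirmed anywhere in this thread: that the directory given to --model_dir uses the standard Hugging Face cache layout (models--ORG--NAME/snapshots/REVISION/), and that the installed whisperx passes the model name through to faster-whisper, which accepts a local directory in place of a model name. The helper find_local_snapshot is hypothetical and only illustrates the lookup; HF_HUB_OFFLINE is the huggingface_hub environment variable that disables network calls.

```python
import os
from pathlib import Path
from typing import Optional

# Ask huggingface_hub not to make any network calls. This has to be set
# before huggingface_hub (or anything importing it) is loaded.
os.environ["HF_HUB_OFFLINE"] = "1"


def find_local_snapshot(model_dir: str, repo_id: str) -> Optional[Path]:
    """Hypothetical helper: locate an already-downloaded model snapshot.

    Assumes model_dir follows the Hugging Face cache layout:
    <model_dir>/models--<org>--<name>/snapshots/<revision>/model.bin
    """
    repo_folder = Path(model_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_folder / "snapshots"
    if not snapshots.is_dir():
        return None
    # Keep only snapshot folders that actually contain the converted weights.
    candidates = [p for p in snapshots.iterdir() if (p / "model.bin").exists()]
    if not candidates:
        return None
    # If several revisions are cached, pick the most recently modified one.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

If the helper returns a path, passing that path as the model (for example as the --model value, or as the first argument to whisperx.load_model) may avoid the Hub lookup entirely, since a local directory needs no download. This covers only the Whisper model itself; any alignment or diarization models would have to be made available locally in a similar way.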

sorgfresser commented 1 year ago

Maybe #263 can help you

nkilm commented 6 months ago

I have developed a custom script based on WhisperX that runs various pipelines entirely offline. All you need to do is download the pre-trained models and specify their paths in the script.

Repo: https://github.com/nkilm/offline-whisperx