Open aaronchantrill opened 1 month ago
Yes, this sounds reasonable. I'm not sure if Whisper on the CPU will be a good experience though.
I will merge in the PRs and tidy things up.
Whisper on the CPU has not been a good experience for me, which is why I would like to change the default to `use_cuda=True`.
Here is the relevant block in `glados.py`:
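(The block itself didn't come through above; the sketch below is illustrative only, not the actual `glados.py` source. The class and path names are placeholders; the point is that `use_cuda` is hard-coded to `False` at construction time rather than read from `glados_config.yml`.)

```python
# Illustrative sketch: the ASR engine is built with use_cuda hard-coded,
# so whisper.cpp always runs on the CPU regardless of any config file.
class WhisperASR:  # placeholder stand-in for the real ASR wrapper
    def __init__(self, model_path: str, use_cuda: bool = False):
        self.model_path = model_path
        self.use_cuda = use_cuda  # False -> CPU-only inference

asr = WhisperASR("models/whisper-model.bin")  # placeholder model path
```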
This causes Whisper.cpp to run on the CPU instead of the GPU, which is quite a bit slower. On the other hand, if there is not enough free VRAM on the GPU when attempting to load Whisper onto it, GlaDOS just crashes with a segmentation fault, so I understand why `False` is the default. I'd like to expose this as a setting in the `glados_config.yml` file, so it is easier to find and update. Ideally, GlaDOS would check whether the GPU has enough free VRAM before attempting to load the Whisper model onto it, but I'm not sure how to implement that.
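One possible shape for that VRAM check, as a hedged sketch: use `torch.cuda.mem_get_info()` (if PyTorch happens to be available) to read free device memory, and fall back to the CPU on any failure. The config key name and the VRAM threshold below are assumptions, not part of the current codebase.

```python
# Sketch: gate use_cuda on free VRAM before loading the Whisper model.
# Proposed glados_config.yml entry (name is a suggestion, not an existing key):
#   use_cuda: true

def has_enough_vram(required_bytes: int) -> bool:
    """Return True only if the first CUDA device reports enough free memory."""
    try:
        import torch  # optional dependency; stay on the CPU if missing
        if not torch.cuda.is_available():
            return False
        free, _total = torch.cuda.mem_get_info()  # (free, total) in bytes
        return free >= required_bytes
    except Exception:
        # Any driver/runtime failure: play it safe and stay on the CPU,
        # instead of segfaulting during model load.
        return False

# Rough placeholder figure for a medium Whisper model plus working buffers.
WHISPER_VRAM_BYTES = 2 * 1024**3

use_cuda = has_enough_vram(WHISPER_VRAM_BYTES)
```

The fallback-to-CPU-on-error design matches the concern above: a failed probe degrades to the slow path instead of crashing.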