mallorbc / whisper_mic

Project that allows one to use a microphone with OpenAI whisper.
MIT License
705 stars 158 forks

Issues with Python Setup #61

Open ACDrafahl opened 9 months ago

ACDrafahl commented 9 months ago

First off, I want to say thank you for making this. It's been a lifesaver so far.

Second, I'm very new to this kind of project and to Python in general, so I apologize if this question is obvious or nonsensical. The CLI commands are great, but I'm trying to do the same setup in Python (specifying the device, the model, the mic, etc.). I know that the init function sets everything to a default value, but I was wondering if there is a way to set these options manually in a separate Python file so that any user can download my code and have it work with your whisper_mic.py file out of the box. I'd also like to know how to find the mic index that I need and how to set the FP16/FP32/INT8 options. I keep getting a warning that FP16 isn't supported on my CPU, which causes it to default to FP32. I'd like to set it to FP32 from the start. If I have to modify the whisper_mic.py file itself, I understand, but I wanted to make sure there wasn't another way.

mallorbc commented 8 months ago

You could make it so that the code takes a config file instead of arguments.

If you open a PR that does the following, it should work and I will merge it:

  1. Make a pydantic object that configures how the software works
  2. Have a way to change those values with the CLI flags
  3. Have a way to pass a JSON file to configure the software
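
As a rough illustration of the three steps above, here is a minimal sketch of how defaults, a JSON file, and CLI flags could be layered. To keep it dependency-free it uses a standard-library dataclass as a stand-in for the pydantic object; a pydantic `BaseModel` with the same fields would add type validation. The names `MicConfig`, `load_config`, and the three fields are hypothetical and are not part of whisper_mic.

```python
import argparse
import json
from dataclasses import asdict, dataclass, fields
from pathlib import Path
from typing import Optional


@dataclass
class MicConfig:
    """Hypothetical settings object; field names are illustrative only."""
    model: str = "base"
    device: str = "cpu"
    mic_index: Optional[int] = None  # None means "use the default microphone"


def load_config(json_path=None, argv=()):
    """Merge defaults, then an optional JSON file, then CLI flags."""
    values = asdict(MicConfig())

    # Step 3: values from a JSON file override the defaults.
    if json_path is not None:
        values.update(json.loads(Path(json_path).read_text()))

    # Step 2: CLI flags that were actually passed override everything else.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model")
    parser.add_argument("--device")
    parser.add_argument("--mic_index", type=int)
    args = parser.parse_args(list(argv))
    values.update({k: v for k, v in vars(args).items() if v is not None})

    # Reject unknown keys early instead of failing later.
    known = {f.name for f in fields(MicConfig)}
    unknown = set(values) - known
    if unknown:
        raise ValueError(f"Unknown config keys: {sorted(unknown)}")
    return MicConfig(**values)
```

From a script, one would call something like `load_config("settings.json", sys.argv[1:])` and pass the resulting fields on to the library. The precedence chosen here (CLI over JSON over defaults) is one reasonable convention, not something the thread specifies.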

Normally the default mic index works. Otherwise you can print out the mic devices and select the index that makes the most sense.
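
For reference, a minimal sketch of printing the devices and picking an index. It assumes audio capture goes through the SpeechRecognition package (imported as `speech_recognition`) with PyAudio installed; `list_microphones` and `pick_mic_index` are hypothetical helper names, not whisper_mic functions.

```python
def list_microphones():
    """Return (index, name) pairs for every audio device that is reported."""
    # Imported lazily so pick_mic_index works without audio libraries.
    import speech_recognition as sr
    return list(enumerate(sr.Microphone.list_microphone_names()))


def pick_mic_index(names, keyword):
    """Return the index of the first device whose name contains keyword, else None."""
    for index, name in enumerate(names):
        if keyword.lower() in name.lower():
            return index
    return None


# Example (needs SpeechRecognition and PyAudio installed):
#     for index, name in list_microphones():
#         print(index, name)
```

The index printed next to the device you want is the value to use as the mic index. Matching by name with `pick_mic_index` can be more robust than hard-coding a number, since indices may differ between machines.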

Other backends, such as transformers or ctranslate2, could be added to support INT8. As it stands, FP32 is the default on CPU and FP16 on GPU.
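
If the goal is just to avoid the FP16 warning on CPU, here is a hedged sketch that calls the openai-whisper package directly rather than going through whisper_mic. It relies on the `fp16` decoding option accepted by `model.transcribe`; the helper names `wants_fp16` and `transcribe_file` are hypothetical, and whether whisper_mic exposes an equivalent setting is not stated in this thread.

```python
def wants_fp16(device):
    """FP16 is only requested for CUDA devices; anything else gets FP32."""
    return str(device).startswith("cuda")


def transcribe_file(path, model_name="base", device="cpu"):
    """Transcribe an audio file, choosing the precision from the device."""
    # Imported lazily; requires the openai-whisper package.
    import whisper
    model = whisper.load_model(model_name, device=device)
    # Passing fp16=False up front on CPU avoids the fallback warning.
    result = model.transcribe(path, fp16=wants_fp16(device))
    return result["text"]
```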