beeldengeluid / dane-whisper-asr-worker

MIT License

1 main code #18

Closed greenw0lf closed 4 months ago

greenw0lf commented 5 months ago

Closes #1

Added the main code that runs Whisper on audio to generate transcriptions.

When testing, I recommend running on the CPU, which is already the value set in the config.yml. If you have a sufficiently powerful Nvidia GPU available, you can change the value to cuda and test with that (I will also test it myself on a GPU I have available).

I have also tested the vad and word_timestamps settings, and they seem to work fine.
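
For anyone switching the device value, here is a minimal sketch of how a cpu/cuda setting could be resolved with a fallback to the CPU. The function name `resolve_device` and the `cuda_available` flag are hypothetical and not taken from this PR; in practice the flag would come from whatever GPU check the worker uses.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Return the device to run on (hypothetical helper, not the PR's code).

    Falls back to "cpu" when "cuda" is requested but no usable GPU is present.
    """
    device = requested.strip().lower()
    if device not in ("cpu", "cuda"):
        raise ValueError(f"unsupported device: {requested!r}")
    if device == "cuda" and not cuda_available:
        # Assumption: silently falling back is acceptable; a real worker
        # might prefer to log a warning or fail fast instead.
        return "cpu"
    return device


if __name__ == "__main__":
    # The second argument stands in for a real GPU availability check.
    print(resolve_device("cuda", cuda_available=False))
```

Passing the availability flag in as an argument keeps the helper free of GPU library imports, so it can be exercised on a machine without CUDA.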

Veldhoen commented 5 months ago

Oh, and of course it would be great to add some automated tests, for instance for the config settings.
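
As a rough illustration of what such tests could look like, here is a sketch in pytest style. The config keys (`device`, `vad`, `word_timestamps`) mirror the settings mentioned in this thread, but their exact names and types in config.yml are assumptions, and `validate_config` is a hypothetical helper rather than something the PR provides.

```python
ALLOWED_DEVICES = ("cpu", "cuda")  # assumed set of valid device values


def validate_config(cfg: dict) -> list:
    """Return a list of human-readable problems found in cfg (hypothetical helper)."""
    errors = []
    if cfg.get("device") not in ALLOWED_DEVICES:
        errors.append("device must be 'cpu' or 'cuda'")
    # Assumption: both settings are plain booleans in the config file.
    for key in ("vad", "word_timestamps"):
        if not isinstance(cfg.get(key), bool):
            errors.append(f"{key} must be a boolean")
    return errors


def test_valid_config_has_no_errors():
    cfg = {"device": "cpu", "vad": True, "word_timestamps": False}
    assert validate_config(cfg) == []


def test_invalid_device_is_reported():
    cfg = {"device": "gpu", "vad": True, "word_timestamps": True}
    assert validate_config(cfg) == ["device must be 'cpu' or 'cuda'"]


def test_non_boolean_flag_is_reported():
    cfg = {"device": "cuda", "vad": "yes", "word_timestamps": True}
    assert validate_config(cfg) == ["vad must be a boolean"]
```

Returning a list of problems instead of raising on the first one lets a single test run surface every misconfigured key at once.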

greenw0lf commented 5 months ago

I tried to run it on a GPU via Docker, but it is more complicated than I thought. It expects CUDA version 11.x, and none of the several things I tried worked. I might have to use a multi-stage build to make it work. Alternatively, I could test whether the GPU works the S3 way since, in that case, it will run locally, without needing Docker (if I remember correctly).

Otherwise, if you have ideas on how to address this, let me know.

I will also be adding tests, but as part of another issue.