jdrbc / podly_pure_podcasts

Ad-block for podcasts
MIT License

Separation of transcription and scoring openai clients #24

Open xerootg opened 3 days ago

xerootg commented 3 days ago

Hey! For a while I've been running a fork of this wonderful tool, and it's great to see its overall maturity grow. I'm using gemma2 on ollama and faster-whisper-server to run the backend, with great results. I've got a mildly tweaked system message that has worked well with gemma2 for about three weeks now, and I'd like to contribute that back.

However, there's an issue when using faster-whisper-server: the model is hardcoded to whisper-1 and shares an OpenAI client with the scoring half. That means a single API base, so I need to run a reverse proxy in addition to ollama, faster-whisper-server, and podly_pure_podcasts. Is there a reason you use one OpenAI client for both transcription and scoring? Are you opposed to a PR that would separate the two when a custom whisper server is provided? There are some other tweaks in my fork of faster-whisper-server that would probably be better handled in podly instead, so others who run their own whisper instance can use this.
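The split being proposed could be as simple as a config that carries a separate base URL for the whisper server and falls back to the shared endpoint when none is set. A minimal sketch, assuming hypothetical setting names (these are illustrative, not podly's actual config keys):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientConfig:
    # Hypothetical settings; names are illustrative, not podly's real config.
    openai_base_url: str = "https://api.openai.com/v1"  # scoring endpoint (e.g. ollama's OpenAI-compatible API)
    whisper_base_url: Optional[str] = None  # e.g. a faster-whisper-server endpoint

    def transcription_base_url(self) -> str:
        # Use the dedicated whisper server when configured,
        # otherwise fall back to the shared OpenAI endpoint.
        return self.whisper_base_url or self.openai_base_url


# With a custom whisper server, transcription and scoring get distinct bases,
# removing the need for a reverse proxy in front of both backends.
cfg = ClientConfig(whisper_base_url="http://localhost:8000/v1")
```

Two OpenAI clients would then be constructed from the two base URLs instead of sharing one.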

jdrbc commented 1 day ago

Hey! That sounds awesome. I've no objections to two clients. @frrad has done some abstraction on the transcription portion to support local & remote transcription. It's also possible to add a third implementation of Transcriber if it isn't easy to just drop-in replace the OpenAI client.
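A third implementation along those lines might look like the sketch below. The interface and class names here are hypothetical (the real Transcriber abstraction in podly may differ), and the network call is stubbed so the example stays self-contained:

```python
from abc import ABC, abstractmethod


class Transcriber(ABC):
    """Hypothetical transcription interface; podly's actual abstraction may differ."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the transcript text for the given audio file."""


class FasterWhisperServerTranscriber(Transcriber):
    """Sketch of a Transcriber backed by a self-hosted faster-whisper-server."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def transcribe(self, audio_path: str) -> str:
        # A real implementation would POST the audio to the server's
        # OpenAI-compatible transcription endpoint; stubbed here.
        return f"transcript of {audio_path} via {self.base_url}"


t = FasterWhisperServerTranscriber("http://localhost:8000/v1")
```

This keeps the scoring client untouched: the whisper endpoint is injected into the transcriber rather than shared through a single OpenAI client.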