djmango / obsidian-transcription

Obsidian plugin to create high-quality transcriptions from markdown linked audio files
https://swiftink.io
MIT License

ASR Diarization Endpoint #25

Open dahifi opened 11 months ago

dahifi commented 11 months ago

It looks like the ASR repo is adding WhisperX support, which adds diarization options.

https://github.com/ahmetoner/whisper-asr-webservice/pull/125

I def want to add this, but wanted to touch base on how you'd like it done. I'm thinking we need an additional command in the command palette to trigger transcription with diarization, but I'm not sure yet whether we need to expose the min/max speaker options as well.

I'll keep you posted once the ASR PR is done.

AustinSaintAubin commented 11 months ago

As a community thought, I think having a min/max/auto option for the number of speakers would be good.
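
To make the min/max/auto idea concrete, here is a rough sketch of how such a setting might be mapped onto an ASR request URL. It is only an illustration, not the plugin's actual code: the `SpeakerCount` and `DiarizationOptions` types and the `buildAsrUrl` helper are hypothetical, and the query parameter names `diarize`, `min_speakers`, and `max_speakers` are assumptions, since the upstream PR was still in progress at this point in the thread.

```typescript
// Hypothetical setting shape: "auto" leaves the speaker count to the backend.
type SpeakerCount = "auto" | { min: number; max: number };

interface DiarizationOptions {
  enabled: boolean;
  speakers: SpeakerCount;
}

// Builds the request URL for the ASR service.
// ASSUMPTION: the parameter names `diarize`, `min_speakers`, and `max_speakers`
// are guesses; check them against the ASR service version you actually run.
function buildAsrUrl(baseUrl: string, opts: DiarizationOptions): string {
  // Resolving "/asr" against the base works whether or not the
  // configured URL already ends in "/asr".
  const url = new URL("/asr", baseUrl);
  url.searchParams.set("task", "transcribe");
  url.searchParams.set("output", "json");

  if (opts.enabled) {
    url.searchParams.set("diarize", "true");
    if (opts.speakers !== "auto") {
      const { min, max } = opts.speakers;
      if (!Number.isInteger(min) || !Number.isInteger(max) || min < 1 || max < min) {
        throw new RangeError("expected integers with 1 <= min <= max");
      }
      url.searchParams.set("min_speakers", String(min));
      url.searchParams.set("max_speakers", String(max));
    }
  }
  return url.toString();
}
```

With "auto", no speaker bounds are sent at all, so the backend's own default applies; an explicit range is validated before being added to the query string.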

djmango commented 11 months ago

An additional option only available when ASR is selected would make the most sense. Don't have bandwidth for major additions like this at the moment, will review PR.

dahifi commented 11 months ago

Roger that. The ASR PR has some problems, and it could be some time for me as well. I'll keep you posted when there's an update.

dinhe878 commented 4 months ago

Not sure if this is the right place to ask, but what ASR URL should I provide in the plugin settings? I tried running the Docker container locally and provided its IP address (e.g. http://123.123.123.123:9000/asr), but it didn't work.

Any help is appreciated!