hyperaudio / ha-converter

Hyperaudio Converter - converts from JSON/SRT to HTML Based Interactive Transcript
https://hyperaud.io/converter/converter.html

🔰 Request: Support Importing WhisperX JSON #28

Open natelawrence opened 1 month ago

natelawrence commented 1 month ago

In a recent hunt for more ASR providers that offer per-word timecodes, I found some that I already knew of and a few I hadn't heard of before. One of them is WhisperX.

We are all familiar with OpenAI's Whisper technology. However, its default models only produce timecodes for long phrases, and those timecodes are not very accurate. WhisperX is a fork of Whisper that provides more accurate timecodes, with a start and end timecode for every word in the transcript.

You can generate test data with a free demo of WhisperX.

Or, if you would like test data that has already been generated: ASR Timed Text Format Test 2 [WhisperX].json. The corresponding audio file can be obtained here.

Being able to import WhisperX's format would allow WhisperX users to bring their transcripts and edit them in HyperAudio Lite Editor, if desired.
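
To make the request more concrete, here is a minimal sketch of what such an import could look like. It is not the converter's actual code. It assumes the WhisperX JSON has a top-level `segments` list whose entries each hold a `words` list with `word`, `start`, and `end` (in seconds), and it assumes the target markup is one `<span>` per word with `data-m` (start, in milliseconds) and `data-d` (duration, in milliseconds). The function name `whisperx_to_spans` is hypothetical.

```python
import html


def whisperx_to_spans(data):
    """Turn a WhisperX-style dict into HTML paragraphs of timed word spans.

    Assumed input shape: {"segments": [{"words": [{"word", "start", "end"}]}]}
    Assumed output markup: <span data-m="start ms" data-d="duration ms">
    """
    paragraphs = []
    for segment in data.get("segments", []):
        spans = []
        for word in segment.get("words", []):
            text = html.escape(word.get("word", "").strip(), quote=False)
            if not text:
                continue
            start, end = word.get("start"), word.get("end")
            if start is None or end is None:
                # Assumption: some tokens may carry no timing; keep them untimed.
                spans.append(f"<span>{text} </span>")
                continue
            start_ms = round(start * 1000)
            duration_ms = max(round(end * 1000) - start_ms, 0)
            spans.append(
                f'<span data-m="{start_ms}" data-d="{duration_ms}">{text} </span>'
            )
        if spans:
            # One paragraph per WhisperX segment.
            paragraphs.append("<p>" + "".join(spans) + "</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = {
        "segments": [
            {
                "words": [
                    {"word": "Hello", "start": 0.5, "end": 0.75},
                    {"word": "world", "start": 0.75, "end": 1.25},
                ]
            }
        ]
    }
    print(whisperx_to_spans(sample))
```

Mapping each segment to its own paragraph is only one possible choice; speaker labels and confidence scores, if present in the file, are ignored in this sketch.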

natelawrence commented 1 month ago

Also see: From HyperAudio Lite Editor Issues: 🔰 Integrate replicate.com WhisperX