hyperaudio / hyperaudio-lite-editor

A lightweight transcript editor for editing and correcting STT-generated timed transcripts
GNU Affero General Public License v3.0

Read write Gentle JSON #160

Open maboa opened 1 year ago

maboa commented 1 year ago

@natelawrence I would love it if you added read/write support for Gentle alignments and YouTube Automatic Captions XML/VTT formats that contain per-word timestamps.

I'd like to put a $150 bounty on both:
Hyperaudio editor reads/writes Gentle JSON.
Hyperaudio editor reads/writes YT AC XML.

https://twitter.com/natelawrence/status/1653373649758605313?s=20

natelawrence commented 3 months ago

For context, Gentle is a forced aligner that, given an audio file and a transcript, generates per-word (and even sub-word) timecodes. Its source code is available here on GitHub.

🔰 I'm attaching two example Gentle JSON files (for the same recording) below.
ASR Timed Text Format Test 2 [Gentle] P.json

This second JSON file uses homophone substitutions for words that are not in Gentle's pronunciation dictionary, in order to acquire more reliable phoneme timecodes for more words.
ASR Timed Text Format Test 2 [Gentle] H.json
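
For anyone picking this up, here is a rough sketch of pulling per-word timings out of a Gentle file like the two above. The field names (`words`, `case`, `word`, `start`, `end`, with times in seconds) are what Gentle output typically contains, but treat them as assumptions and check them against the attachments. `read_gentle_words` and `SAMPLE` are made up for illustration and are not part of Gentle or Hyperaudio.

```python
import json


def read_gentle_words(gentle_json: str) -> list[dict]:
    """Return aligned words as dicts with text, start_ms and duration_ms."""
    data = json.loads(gentle_json)
    words = []
    for entry in data.get("words", []):
        # Assumption: only entries with case == "success" carry start/end times.
        if entry.get("case") != "success":
            continue
        start_ms = round(entry["start"] * 1000)
        end_ms = round(entry["end"] * 1000)
        words.append(
            {
                "text": entry["word"],
                "start_ms": start_ms,
                "duration_ms": end_ms - start_ms,
            }
        )
    return words


# Hypothetical sample data, not taken from the attached files.
SAMPLE = json.dumps(
    {
        "transcript": "Hello gizmo world",
        "words": [
            {"word": "Hello", "case": "success", "start": 0.5, "end": 0.75},
            {"word": "gizmo", "case": "not-found-in-audio"},
            {"word": "world", "case": "success", "start": 1.0, "end": 1.5},
        ],
    }
)

aligned = read_gentle_words(SAMPLE)
```

This sketch simply drops words that Gentle could not align. A real importer would probably want to keep them in the transcript and estimate their times from the neighbouring words.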

The transcripts have some minor custom markup.
[+] = the beginning of a sentence
\\ = the end of a sentence
|| = the end of a paragraph
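
The markup above could be used to rebuild sentence and paragraph structure on import. Below is a minimal sketch, assuming the markers appear as separate whitespace-delimited tokens and that the sentence-end marker is literally two backslashes as written above. `split_marked_transcript` and `MARKED` are hypothetical names for illustration only.

```python
SENTENCE_START = "[+]"
SENTENCE_END = r"\\"  # two literal backslashes, as written in the markup description
PARAGRAPH_END = "||"


def split_marked_transcript(text: str) -> list[list[str]]:
    """Group a marked-up transcript into paragraphs, each a list of sentences."""
    paragraphs: list[list[str]] = []
    sentences: list[str] = []
    current: list[str] = []

    def close_sentence() -> None:
        # Move any collected words into the sentence list.
        if current:
            sentences.append(" ".join(current))
            current.clear()

    def close_paragraph() -> None:
        # A paragraph end also closes any sentence still open.
        close_sentence()
        if sentences:
            paragraphs.append(list(sentences))
            sentences.clear()

    for token in text.split():
        if token in (SENTENCE_START, SENTENCE_END):
            close_sentence()
        elif token == PARAGRAPH_END:
            close_paragraph()
        else:
            current.append(token)

    close_paragraph()  # flush trailing text that has no closing marker
    return paragraphs


# Hypothetical example transcript using the three markers.
MARKED = r"[+] Hello world \\ [+] How are you \\ || [+] Fine \\ ||"
structure = split_marked_transcript(MARKED)
```

In an actual importer the markers would need to be stripped from the word list before matching against Gentle's `words` array, since they are not spoken words.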

The corresponding audio file can be obtained here.

natelawrence commented 3 months ago

Also note that although HyperAudio Converter is able to import Gentle files and convert them to HyperAudio hypertranscripts, the hypertranscript format it produces is outdated, so its output cannot be imported directly by HyperAudio Lite Editor.
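
As a starting point for the write side, here is a small sketch that turns word timings into timed `<span>` elements. It assumes the `data-m` (start, in milliseconds) and `data-d` (duration, in milliseconds) attributes that Hyperaudio Lite uses for timed words. Whether HyperAudio Lite Editor needs extra wrapper markup around this is not settled in this thread, so treat the output shape as an assumption. `words_to_spans` and the input dict keys are hypothetical.

```python
from html import escape


def words_to_spans(words: list[dict]) -> str:
    """Render word timings as one paragraph of timed <span> elements."""
    spans = []
    for w in words:
        # Escape the text so characters such as & or < cannot break the markup.
        spans.append(
            f'<span data-m="{w["start_ms"]}" data-d="{w["duration_ms"]}">'
            f'{escape(w["text"])} </span>'
        )
    return "<p>" + "".join(spans) + "</p>"


# Hypothetical input in the shape an importer might produce.
html_paragraph = words_to_spans(
    [
        {"text": "Hello", "start_ms": 500, "duration_ms": 250},
        {"text": "world", "start_ms": 1000, "duration_ms": 500},
    ]
)
```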