lmmx / tap

dex ⠶ tap – an audio transcriber for web radio
MIT License

Online segmentation and transcription for live streams #18

Open lmmx opened 3 years ago

lmmx commented 3 years ago

[Motivated by a feature request to handle live streams in beeb. These are placeholder notes for when I'm ready to implement transcription from live streams following this.]

Speaker segmentation is the only part that complicates the workflow here.

Essentially there's an "offline" segmentation (i.e. done after the show has ended), but we could switch to an "online" segmentation (i.e. done while the show is still on air):

[TBC]

Due to a limitation in the models I'm using (maximum token sequence length), I can't actually input an entire programme to these steps, so the audio has to be processed in pieces either way. In that sense there's no benefit gained from waiting until a programme finishes to build the MP4.

If I implemented live transcription, I'd get the end result much sooner (as I could begin processing the audio while the show was still on air), so I'd be interested in this too.
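
As a rough illustration of the idea (not tap's actual code), here is a minimal sketch of how audio could be cut into model-sized pieces while it is still arriving. The function name `online_chunks`, the `chunk_size` parameter, and the representation of audio as blocks of float samples are all assumptions made for this example. A real implementation would also need to pick chunk boundaries with the speaker segmentation in mind rather than cutting at fixed lengths.

```python
from typing import Iterable, Iterator, List, Sequence


def online_chunks(
    blocks: Iterable[Sequence[float]], chunk_size: int
) -> Iterator[List[float]]:
    """Yield fixed-size chunks of samples as soon as enough audio has arrived.

    `blocks` stands in for a live stream: each item is whatever batch of
    samples the (hypothetical) downloader has just received.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    buffer: List[float] = []
    for block in blocks:
        buffer.extend(block)
        # Emit every complete chunk currently sitting in the buffer.
        while len(buffer) >= chunk_size:
            yield buffer[:chunk_size]
            buffer = buffer[chunk_size:]
    # The stream has ended: flush whatever is left as a final short chunk.
    if buffer:
        yield buffer


# Toy stream: three blocks of differing sizes, cut into chunks of 4 samples.
stream = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0]]
chunks = list(online_chunks(stream, chunk_size=4))
```

Because this is a generator, each chunk is available to downstream steps as soon as its last sample arrives, rather than only after the whole programme has been downloaded.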