bugbakery / audapolis

an editor for spoken-word audio with automatic transcription
GNU Affero General Public License v3.0

Discussion: Multi-Track Support #409

Open pajowu opened 1 year ago

pajowu commented 1 year ago

This issue contains the discussion for supporting multiple tracks.

In my eyes, supporting multi-track has multiple facets:

Purpose

What are multi-track recordings used for / what use-cases do we want to focus on?

Import

At the moment we diarize the imported audio and do speaker detection on it. For multi-track files, this might not be needed. It could be replaced with splitting the audio at speaker turns (i.e. one track goes silent, the other stops being silent) or similar. The tracks could also provide speaker identification.
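To make the "split at speaker turns" idea a bit more concrete, here is a rough sketch in Python. It is not how audapolis works today; the names (`Turn`, `split_at_speaker_turns`, `silence_threshold`) are made up for illustration, and it assumes we already have a per-frame loudness value (e.g. RMS) for every track. When several tracks are loud in the same frame, it simply picks the loudest one, which is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Turn:
    speaker: str   # name of the track that is active
    start: float   # seconds
    end: float     # seconds


def split_at_speaker_turns(
    levels: Dict[str, List[float]],
    frame_seconds: float,
    silence_threshold: float = 0.01,  # hypothetical default, needs tuning
) -> List[Turn]:
    """Turn per-track loudness frames into a list of speaker turns.

    `levels` maps a track name to its per-frame loudness.
    A frame belongs to the loudest track, or to nobody if every
    track is below `silence_threshold`.
    """
    if not levels:
        return []
    n_frames = min(len(values) for values in levels.values())

    turns: List[Turn] = []
    current: Optional[str] = None  # track active in the running turn
    start_frame = 0

    for i in range(n_frames):
        name, level = max(
            ((track, values[i]) for track, values in levels.items()),
            key=lambda pair: pair[1],
        )
        active = name if level >= silence_threshold else None

        if active != current:
            # The active track changed: close the running turn, if any.
            if current is not None:
                turns.append(
                    Turn(current, start_frame * frame_seconds, i * frame_seconds)
                )
            current = active
            start_frame = i

    # Close the turn that is still open at the end of the audio.
    if current is not None:
        turns.append(
            Turn(current, start_frame * frame_seconds, n_frames * frame_seconds)
        )
    return turns
```

Since each turn carries the name of its track, this would also give us speaker identification for free, at least as long as each person really is on their own track.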

Editing

I'm not sure how editing true multi-track projects would work: Should we just "flatten" them to our current format? How do we deal with overlapping segments? Should we display them in a special way to make clear that those are two tracks running in parallel (maybe split the editor left/right)? We might need to do a few designs first.
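As a starting point for that discussion, this is roughly what "flattening" plus overlap detection could look like. Again just a sketch under assumptions: `Segment`, `flatten` and `find_overlaps` are hypothetical names, not existing audapolis code, and a segment here is nothing more than a track name with a start and end time in seconds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    track: str
    start: float  # seconds
    end: float    # seconds


def flatten(segments: List[Segment]) -> List[Segment]:
    """Merge segments from all tracks into one list ordered by start time."""
    return sorted(segments, key=lambda s: (s.start, s.end))


def find_overlaps(segments: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Return pairs of segments from different tracks that overlap in time."""
    ordered = flatten(segments)
    overlaps: List[Tuple[Segment, Segment]] = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                # Sorted by start time, so nothing later can overlap `first`.
                break
            if first.track != second.track:
                overlaps.append((first, second))
    return overlaps
```

The pairs returned by `find_overlaps` are exactly the places where a flat, single-column editor would have to make a decision, so they might be a good way to find test cases for the designs.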

Export

We already kind of have multi-track export using OpenTimelineIO (otio), which creates one track per speaker. I'm not sure what additional formats we should support.

jasontucker commented 1 month ago

Agreed, this feature is an important one to me. I record multitrack and would love to have a transcription of the combined conversation. Each person is on their own track, so you know who is talking on each track. MacWhisper supports this but has a hard time handling it well. https://goodsnooze.gumroad.com/l/macwhisper