Open anuejn opened 1 year ago
IMO the better way for now would be to split the tasks into smaller tasks. For example, we already agreed that we should split the current ALIGN task into two: one task that only does alignment and depends only on the transcript, and a second task that applies the diarization results to the re-aligned segments.
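A minimal sketch of how the split could look as a dependency graph. The task names `TRANSCRIBE`, `DIARIZE` and `APPLY_DIARIZATION`, the `Task` class and the `ready_tasks` helper are assumptions made for illustration, not the project's actual task model:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    # Names of the tasks that must be finished before this one may start.
    depends_on: frozenset = frozenset()


# Hypothetical task graph after splitting ALIGN into two tasks.
TASKS = [
    Task("TRANSCRIBE"),
    Task("DIARIZE"),
    # Alignment only needs the transcript.
    Task("ALIGN", frozenset({"TRANSCRIBE"})),
    # Applying speaker labels needs both the re-aligned segments and the diarization.
    Task("APPLY_DIARIZATION", frozenset({"ALIGN", "DIARIZE"})),
]


def ready_tasks(tasks, completed):
    """Return the names of tasks that are not done yet but whose dependencies are all done."""
    return [
        t.name
        for t in tasks
        if t.name not in completed and t.depends_on <= completed
    ]
```

With this shape, alignment can be picked up as soon as the transcript exists, regardless of whether diarization has finished.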
I fear that running multiple tasks in parallel that depend on each other could introduce a lot of complexity for relatively small gains. Some things we would maybe need (just brainstorming)
IMO a solution might be for tasks to schedule their dependants (e.g. the transcription task spawning new alignment tasks). This is not the fastest solution and also poses some new UI/UX problems, but it might be safer with the current architecture.
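Roughly, the idea of tasks scheduling their own dependants could look like the sketch below. The `run_queue` function and the `spawns` mapping are hypothetical, and appending to `order` stands in for actually running a task:

```python
from collections import deque


def run_queue(initial, spawns):
    """Run tasks one at a time; each finished task enqueues its dependants.

    `spawns` maps a task name to the task names it schedules on completion.
    Returns the order in which tasks were run.
    """
    queue = deque(initial)
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)  # placeholder for the real work of the task
        # Only now do the dependants come into existence.
        queue.extend(spawns.get(task, []))
    return order
```

A dependant never exists before its parent has finished, so no two interdependent tasks run at once. The cost is that everything in a chain runs strictly one after another.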
Maybe it makes sense to make the dependencies more granular. E.g. the alignment could start aligning the already transcribed bars (maybe with some safety distance) even if the whole document is not transcribed yet. This could also lead to some cool UX, where we really display one worker for each task (similar to normal users) and each worker can report its own status.
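A small sketch of the "safety distance" idea, assuming the transcript is an ordered list of segments addressed by index. The function name, its parameters and the index-based model are all assumptions for illustration:

```python
def alignable_range(transcribed, aligned, safety_distance, transcription_done=False):
    """Return the segment indices that the aligner may process right now.

    transcribed: number of segments the transcription has produced so far
    aligned: number of segments that are already aligned
    safety_distance: trailing segments to leave alone while transcription is
        still running, since they might still change
    """
    if transcription_done:
        limit = transcribed
    else:
        limit = max(0, transcribed - safety_distance)
    # Never return a range that starts after its end.
    return range(aligned, max(aligned, limit))
```

While transcription is running, the aligner stays `safety_distance` segments behind it. Once transcription has finished, the remaining tail becomes alignable as well.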