intel / openvino-plugins-ai-audacity

A set of AI-enabled effects, generators, and analyzers for Audacity®.
GNU General Public License v3.0

Enhancement request: after translation clean up text position #131

Closed ShamanTcler closed 1 month ago

ShamanTcler commented 2 months ago

You have done great work, but there are always things to do better.

This is a sample run: Aud

There is a bit of silence before the first word "Welcome" which seems to really throw the alignment off.

Playing with it I did find I could move things around using the "()" markers. Are these markers documented somewhere?

RyanMetcalfeInt8 commented 2 months ago

Hi @ShamanTcler,

Thanks for the feedback!

Just to double check -- is this with 'Max segment length' set to 1? I see you have individual words there.

I have observed this timestamp misalignment issue as well, when there is a long stretch of no speaking at the beginning of the track. I'm not sure there's much we'll be able to do about it, but we can poke around the whisper.cpp discussions to see if anyone has a good solution for that.

If you were to, say, start your selection at the ~1.15 second mark, do you get more accurate transcriptions?

Thanks, Ryan
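
For anyone who wants to try the "start the selection after the silence" idea outside the GUI, here is a rough sketch of it. It is not how the plugin or whisper.cpp works internally. It assumes you have the audio as a list of float samples, that a fixed amplitude threshold is good enough to find where speech starts, and that the labels came from transcribing the trimmed audio (so their times are relative to the trimmed start). The names `leading_silence_seconds`, `shift_labels`, `format_label_track` and the `threshold` default are all made up for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical label shape for this sketch: (start_seconds, end_seconds, text)
Label = Tuple[float, float, str]


def leading_silence_seconds(
    samples: Sequence[float], sample_rate: int, threshold: float = 0.01
) -> float:
    """Return the time of the first sample whose magnitude exceeds `threshold`.

    If no sample exceeds the threshold, the whole clip counts as silence.
    The threshold value is an assumption; real audio may need tuning.
    """
    for index, sample in enumerate(samples):
        if abs(sample) > threshold:
            return index / sample_rate
    return len(samples) / sample_rate


def shift_labels(labels: List[Label], offset: float) -> List[Label]:
    """Move every label later by `offset` seconds.

    Use this when the labels were produced from audio that had `offset`
    seconds of leading silence removed before transcription.
    """
    return [(start + offset, end + offset, text) for start, end, text in labels]


def format_label_track(labels: List[Label]) -> str:
    """Render labels as tab-separated start/end/text lines, one per label."""
    return "\n".join(
        f"{start:.6f}\t{end:.6f}\t{text}" for start, end, text in labels
    )


if __name__ == "__main__":
    # Toy data: 1 second of silence at 8 samples/second, then 1 second of "speech".
    samples = [0.0] * 8 + [0.5] * 8
    offset = leading_silence_seconds(samples, sample_rate=8)
    shifted = shift_labels([(0.0, 0.5, "Welcome")], offset)
    print(format_label_track(shifted))
```

The tab-separated layout is meant to resemble a plain-text label file; whether it imports cleanly depends on your Audacity version, so treat that part as an assumption too.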

RyanMetcalfeInt8 commented 2 months ago

> Playing with it I did find I could move things around using the "()" markers. Are these markers documented somewhere?

Perhaps -- @petersampsonaudacity, is there Audacity documentation about manipulation of label markers from the GUI someplace? Thanks!