readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0

Help: are there ways to align boundary of voice to boundary of sentences? #219

Closed: jarodtang closed this issue 5 years ago

jarodtang commented 5 years ago

Hi there,

I have two sentences, and found that part of sentence 2's audio is being aligned to sentence 1. Is there any way to force each group of speech to align with its own sentence?

Regards, Jarod

[image: q2]

readbeyond commented 5 years ago

Your "group2" onset is weak, hence I guess it is detected as silence because it does not pass the (default) energy threshold of the VAD.

See the VAD_* parameters of the RuntimeConfiguration: https://www.readbeyond.it/aeneas/docs/runtimeconfiguration.html#aeneas.runtimeconfiguration.RuntimeConfiguration.VAD_EXTEND_SPEECH_INTERVAL_AFTER
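To illustrate why a weak onset can be labelled as silence, here is a toy sketch of an energy-threshold VAD. It is a simplification and not the aeneas implementation: it assumes each frame counts as speech when its log10 energy exceeds the minimum log energy plus a threshold. The frame energies are made up, and 0.699 is used as what I believe is the default `VAD_LOG_ENERGY_THRESHOLD`.

```python
import math

def classify_frames(energies, log_energy_threshold):
    """Label each frame True (speech) or False (nonspeech).

    Toy rule (an assumption, not the aeneas code): a frame is speech
    when its log10 energy is above min(log energy) + threshold.
    """
    log_e = [math.log10(e) for e in energies]
    cutoff = min(log_e) + log_energy_threshold
    return [value > cutoff for value in log_e]

# Hypothetical per-frame energies: two silent frames, a weak onset, full speech.
frames = [1.0, 1.0, 3.0, 40.0, 50.0]

# With the higher threshold the weak onset (3.0) is treated as nonspeech.
default_labels = classify_frames(frames, 0.699)

# With a lower threshold the same frame is treated as speech.
lowered_labels = classify_frames(frames, 0.100)

print(default_labels)
print(lowered_labels)
```

In this toy model, lowering the threshold is what moves the weak onset from "silence" into the speech interval, which is the effect the VAD_* parameters are meant to control.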

Alternatively, you can try reducing the default MFCC window shift and see if it helps.
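A minimal sketch of combining both suggestions into a single runtime configuration string, assuming the `key=value` pairs joined by `|` format that the aeneas tools accept for their `-r` option. The key names `vad_log_energy_threshold` and `mfcc_window_shift` are taken from my reading of the RuntimeConfiguration docs, the helper `build_rconf_string` is hypothetical, and the values are illustrative only, so please check them against the documentation before use.

```python
def build_rconf_string(params):
    """Join parameters into a 'key=value|key=value' string.

    Hypothetical helper, not part of aeneas; keys are sorted so the
    output is deterministic.
    """
    return "|".join("%s=%s" % (key, value) for key, value in sorted(params.items()))

# Illustrative values only: a lower VAD energy threshold and a shorter MFCC shift.
params = {
    "vad_log_energy_threshold": "0.100",
    "mfcc_window_shift": "0.010",
}

rconf_string = build_rconf_string(params)
print(rconf_string)
```

The resulting string could then be passed on the command line as `-r="..."`, assuming that option behaves as documented for your installed version.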

jarodtang commented 5 years ago

I tried adjusting both the VAD and the MFCC parameters, but neither solved the problem.

rconf[RuntimeConfiguration.VAD_LOG_ENERGY_THRESHOLD] = 0.100