readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0

Word Level Granularity For Python API #292

Open Errorbot1122 opened 2 years ago

Errorbot1122 commented 2 years ago

In the CLI, you can turn on word-level granularity by adding --presets-word, but I can't find any function or property that turns it on in the Python Task interface.

zxul767 commented 1 year ago

I achieved it using the following:

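import aeneas.globalconstants as gc
from aeneas.executetask import ExecuteTask
from aeneas.runtimeconfiguration import RuntimeConfiguration
from aeneas.task import Task, TaskConfiguration
from aeneas.textfile import TextFileFormat

config = TaskConfiguration()
# MPLAIN is the multilevel plain-text format (paragraph / sentence / word)
config[gc.PPN_TASK_IS_TEXT_FILE_FORMAT] = TextFileFormat.MPLAIN
# other configurations...

rconf = RuntimeConfiguration()
rconf[RuntimeConfiguration.MFCC_MASK_NONSPEECH] = True
# L3 represents word granularity
rconf[RuntimeConfiguration.MFCC_MASK_NONSPEECH_L3] = True
rconf[RuntimeConfiguration.TTS_CACHE] = True
rconf.set_granularity(3)  # word-level granularity

task = Task()
task.configuration = config
# Before executing, set task.audio_file_path_absolute,
# task.text_file_path_absolute and task.sync_map_file_path_absolute.

ExecuteTask(task, rconf=rconf).execute()
task.output_sync_map_file()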
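
Once the sync map has been written, the word-level timings still have to be pulled out of the nested output. The following is a minimal sketch using only the standard library, and it assumes the task was configured to write JSON, with each fragment carrying "begin", "end", "lines" and "children" keys, and the words sitting at the deepest level of the nesting. The helper names (iter_leaf_fragments, word_timings) and the sample data are made up for illustration and are not part of aeneas.

```python
import json


def iter_leaf_fragments(fragments):
    """Yield the deepest (leaf) fragments of a nested sync map, in order."""
    for fragment in fragments:
        children = fragment.get("children") or []
        if children:
            # Descend: paragraph -> sentence -> word
            yield from iter_leaf_fragments(children)
        else:
            yield fragment


def word_timings(sync_map):
    """Return (text, begin_seconds, end_seconds) for every leaf fragment."""
    return [
        (" ".join(f["lines"]), float(f["begin"]), float(f["end"]))
        for f in iter_leaf_fragments(sync_map["fragments"])
    ]


# Hypothetical hand-written sample: one paragraph, one sentence, two words.
SAMPLE = json.loads("""
{
  "fragments": [
    {
      "id": "p000001", "begin": "0.000", "end": "1.200",
      "lines": ["Hello world"],
      "children": [
        {
          "id": "p000001s000001", "begin": "0.000", "end": "1.200",
          "lines": ["Hello world"],
          "children": [
            {"id": "w1", "begin": "0.000", "end": "0.500",
             "lines": ["Hello"], "children": []},
            {"id": "w2", "begin": "0.500", "end": "1.200",
             "lines": ["world"], "children": []}
          ]
        }
      ]
    }
  ]
}
""")

for text, begin, end in word_timings(SAMPLE):
    print(f"{begin:7.3f} {end:7.3f} {text}")
```

For a real run, load the file written by task.output_sync_map_file() with json.load and pass the result to word_timings in place of SAMPLE.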