readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0

Cannot set dtw algorithm to exact #225

Closed tomassykora closed 4 years ago

tomassykora commented 5 years ago

Hi, I'm trying to find the best parameters for our long audio files, which contain long periods of silence. I wanted to try the exact DTW algorithm, but although I set dtw_algorithm=exact, the log output during processing always shows Requested algorithm: 'stripe'. Is this a bug, or are my settings incorrect? This is the Python code I used:

config_string = u"task_language=eng|is_text_type=plain|os_task_file_format=json|dtw_algorithm=exact|mfcc_mask_nonspeech=True"
task = Task(config_string=config_string)
task.text_file = textfile

with tempfile.TemporaryDirectory() as tmp_dir:
    with tempfile.NamedTemporaryFile(dir=tmp_dir) as input_file, \
            tempfile.NamedTemporaryFile(dir=tmp_dir) as output_file:
        input_file.write(self.recording.audio('wav').read())
        task.audio_file_path_absolute = input_file.name
        task.sync_map_file_path_absolute = output_file.name

        aeneas_logger = Logger(tee=True)
        ExecuteTask(task, logger=aeneas_logger).execute()
        task.output_sync_map_file()

readbeyond commented 5 years ago

Hi,

You are mixing TaskConfiguration parameters and RuntimeConfiguration parameters.

In particular, "dtw_algorithm" and "mfcc_mask_nonspeech" are RuntimeConfiguration parameters.

On the command line, runtime parameters go in the separate -r option:

python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=eng|is_text_type=subtitles|os_task_file_format=srt" output.srt -r="dtw_algorithm=exact|mfcc_mask_nonspeech=True"

If you add "-v" you will see the full log.

When using aeneas as a library, you might want to:

your_rconf = RuntimeConfiguration(config_string="dtw_algorithm=exact|mfcc_mask_nonspeech=True")

and then pass the rconf instance to e.g.

ExecuteTask(task=your_task, rconf=your_rconf, logger=your_logger)

or other methods that accept the rconf argument.
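
As a minimal sketch of the fix for the string in the original question, the mixed configuration can be separated into a task part and a runtime part before either object is built. The helper `split_config_string` and the `RUNTIME_KEYS` set are hypothetical names, not part of aeneas, and the set lists only the two keys named in this thread (aeneas defines more runtime parameters than these).

```python
# Keys the reply above identifies as RuntimeConfiguration parameters.
# Assumption: illustrative only; this lists just the two keys named in this
# thread, and this helper is not part of the aeneas API.
RUNTIME_KEYS = {"dtw_algorithm", "mfcc_mask_nonspeech"}


def split_config_string(mixed, runtime_keys=RUNTIME_KEYS):
    """Split a 'key=value|key=value' string into (task_part, runtime_part)."""
    task_pairs, runtime_pairs = [], []
    for pair in mixed.split("|"):
        if not pair:
            continue  # skip empty segments such as a trailing '|'
        key = pair.split("=", 1)[0].strip()
        if key in runtime_keys:
            runtime_pairs.append(pair)
        else:
            task_pairs.append(pair)
    return "|".join(task_pairs), "|".join(runtime_pairs)


# The mixed string from the original question.
mixed = (
    "task_language=eng|is_text_type=plain|os_task_file_format=json"
    "|dtw_algorithm=exact|mfcc_mask_nonspeech=True"
)
task_string, rconf_string = split_config_string(mixed)
print(task_string)
print(rconf_string)
```

The first string would then go to Task(config_string=...) and the second to RuntimeConfiguration(config_string=...), with the resulting rconf instance passed to ExecuteTask as described above.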

HTH,

Alberto Pettarin
