readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0

How to use hindi language in aeneas? #277

Open YashMakan opened 3 years ago

YashMakan commented 3 years ago

Hi, I want to align a Hindi audio file with a plain text file in Hindi (Devanagari script) using aeneas. How can I do that? After some research, I am using the following command:

python -m aeneas.tools.execute_task hindi_audio.mp3 hindi_text.txt "task_language=hin|os_task_file_format=json|is_text_type=plain" map.json

but it fails with this error:

[WARN] Unable to load Python C Extensions
[WARN] Running the slower pure Python code
[WARN] See the documentation for directions to compile the Python C Extensions
[INFO] Validating config string (specify --skip-validator to bypass)...
[INFO] Validating config string... done
[INFO] Creating task...
[INFO] Creating task... done
[INFO] Executing task...
[ERRO] An unexpected error occurred while executing the task:
[ERRO] Unexpected error while executing task : Both the C extension and the pure Python code failed. (Wrong arguments? Input too big?)

Kindly help. Thanks.

yasntrk commented 3 years ago

If your text file contains "-" characters, the aligner can fail with errors like the one above. Try removing them from the text and running the task again.
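
A minimal sketch of that cleanup, in case it helps. It assumes the dashes really are the cause, which this thread has not confirmed. The helper names (strip_dashes, clean_file), the output file name, and the exact set of dash characters are my own assumptions for illustration and are not part of aeneas:

```python
import re
from pathlib import Path

# ASCII hyphen plus the Unicode dashes U+2010..U+2015.
# Which characters actually upset the aligner is an assumption.
DASHES = re.compile(r"[-\u2010-\u2015]")


def strip_dashes(text: str) -> str:
    """Replace dash characters with spaces, collapse runs of spaces,
    and drop lines that end up empty (e.g. a line holding only "-")."""
    cleaned_lines = []
    for line in text.splitlines():
        line = DASHES.sub(" ", line)
        line = re.sub(r"[ \t]+", " ", line).strip()
        if line:
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines) + "\n"


def clean_file(src: str, dst: str) -> None:
    """Write a dash-free UTF-8 copy of src to dst (hypothetical helper)."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(strip_dashes(text), encoding="utf-8")


if __name__ == "__main__":
    # hindi_text_clean.txt is a made-up output name.
    clean_file("hindi_text.txt", "hindi_text_clean.txt")
```

After that, rerun the same execute_task command with hindi_text_clean.txt in place of hindi_text.txt. Note that with is_text_type=plain each line is one fragment, so dropping a line that held only a dash also drops that fragment.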