mozilla / DSAlign

DeepSpeech based forced alignment tool
Mozilla Public License 2.0

aligned output json doesn't have unicode text #10

Closed: tensorfoo closed this issue 4 years ago

tensorfoo commented 5 years ago

Instead of readable Unicode text, the aligned text strings in the output JSON appear as escape sequences, for example "\uXXXX\uYYYY....". Is there a way to tell the tool to write the Unicode characters directly?

Edit: the desired output is something like "".join(s) rather than the s that currently appears in the output.

tilmankamp commented 5 years ago

One could add ensure_ascii=False to the json.dump calls.
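
A minimal sketch of what that flag changes, using only Python's standard json module. The entry below is made up for illustration (its field names are assumptions, not necessarily DSAlign's actual output schema), and the helper dump_to_string is hypothetical:

```python
import io
import json

# Hypothetical alignment entry; the real DSAlign output schema may differ.
entries = [{"start": 0, "end": 1200, "aligned": "grüße"}]


def dump_to_string(data, ensure_ascii):
    """Serialize data with json.dump into an in-memory text buffer."""
    buffer = io.StringIO()
    json.dump(data, buffer, ensure_ascii=ensure_ascii)
    return buffer.getvalue()


# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = dump_to_string(entries, ensure_ascii=True)

# With ensure_ascii=False the characters are written as-is.
readable = dump_to_string(entries, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same data with json.loads, so the escaped form is valid JSON and only harder to read. When writing to a real file with ensure_ascii=False, it is safest to open it with an explicit encoding, e.g. open(path, "w", encoding="utf-8").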

tensorfoo commented 5 years ago

> One could add ensure_ascii=False to the json.dump calls.

That worked. Thank you. Should that have been the default behaviour?

tilmankamp commented 4 years ago

Covered by #32