NVIDIA / NeMo-text-processing

NeMo text processing for ASR and TTS
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_normalization/wfst/wfst_text_normalization.html
Apache License 2.0
246 stars, 80 forks

German TN: normalized numbers wrongly include spaces #64

Closed eginhard closed 1 year ago

eginhard commented 1 year ago

Describe the bug

In German, cardinal numbers are currently normalized with a space between every component word (units, tens, hundreds, thousands), although these components should normally be written together as a single word. In TTS systems, this leads to unnatural pauses in the output.

Steps/Code to reproduce bug

  1. Normalize "18940722"
  2. Output is "achtzehn millionen neun hundert vierzig tausend sieben hundert zwei und zwanzig", see https://github.com/NVIDIA/NeMo-text-processing/blob/9afbaf20c569ed41567988956e0e90dc913dd516/tests/nemo_text_processing/de/data_text_normalization/test_cases_cardinal.txt#L48

Expected behavior

Output should be "achtzehn millionen neunhundertvierzigtausendsiebenhundertzweiundzwanzig" (only millions and above are written as separate words).
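As a possible interim workaround, the spaced output could be post-processed. The following is only a minimal sketch: `join_german_cardinal` and `LARGE_UNITS` are hypothetical names, not part of NeMo-text-processing, and the function assumes its input is a verbalized cardinal on its own (applied to a full sentence, it would merge unrelated words too).

```python
# Hypothetical post-processing helper; NOT part of NeMo-text-processing.
# Assumption: the input string contains only the verbalized cardinal.

# Number words that remain separate words (millions and above).
LARGE_UNITS = {
    "million", "millionen",
    "milliarde", "milliarden",
    "billion", "billionen",
}


def join_german_cardinal(spaced: str) -> str:
    """Fuse space-separated number words, keeping large units as separate words."""
    groups = []   # finished output words
    current = []  # component words waiting to be fused into one word
    for token in spaced.split():
        if token in LARGE_UNITS:
            # The multiplier before a large unit is a word of its own,
            # e.g. "achtzehn millionen".
            if current:
                groups.append("".join(current))
                current = []
            groups.append(token)
        else:
            current.append(token)
    if current:
        groups.append("".join(current))
    return " ".join(groups)


spaced = ("achtzehn millionen neun hundert vierzig tausend "
          "sieben hundert zwei und zwanzig")
print(join_german_cardinal(spaced))
```

Applied to the output from the reproduction steps, this yields the expected string above. A proper fix would belong in the German cardinal grammar itself rather than in a string-level pass like this.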

Environment details

NVIDIA NeMo Text Processing 0.1.7rc0

yzhang123 commented 1 year ago

@eginhard thanks for raising this! We are aware of the issue; our initial German version did not tackle this feature. However, if you want this so that your TTS model can adjust pauses based on spaces, be aware that when pronouncing neunhundertvierzigtausendsiebenhundertzweiundzwanzig, there are short pauses between neunhundert, vierzigtausend,

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 7 days since being marked as stale.