NVIDIA / NeMo-text-processing

NeMo text processing for ASR and TTS
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_normalization/wfst/wfst_text_normalization.html
Apache License 2.0

Question: How does the logic for TimeFst for En work? #104

Closed: ChinmayPatil11 closed this issue 8 months ago

ChinmayPatil11 commented 9 months ago

Hi. Sorry if this is a basic question, but I am a beginner with pynini and am confused by the logic behind TimeFst in the En inverse text normalization.

In the file nemo_text_processing/inverse_text_normalization/en/taggers/time.py, the time components are tagged as hours and minutes as required, e.g.:

twelve thirty -> time { hours: "12" minutes: "30" }
twelve past one -> time { minutes: "12" hours: "1" }

In the file nemo_text_processing/inverse_text_normalization/en/verbalizers/time.py, the same tagged strings are verbalized so that only the time remains, e.g. time { hours: "12" minutes: "30" } -> 12:30

I am unable to understand how the second case, where the tagged string is time { minutes: "12" hours: "1" }, is handled in the code. Are the fields reordered during processing, or is this done in the final step in FinalVerbFst?

Would be glad if anyone could help. Thank you!

anand-nv commented 8 months ago

Check the reordering logic in https://github.com/NVIDIA/NeMo-text-processing/blob/17f7aaa7620b7922197c3b041611d5c8636da050/nemo_text_processing/text_normalization/normalize.py#L347
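
For readers who want the idea without opening the linked file, here is a minimal pure-Python sketch of permutation-based reordering. It assumes the general approach is to try different orderings of the tagged fields until one is accepted by the verbalizer. The function names (parse_fields, candidate_orderings, verbalize) and the regex standing in for the verbalizer FST are hypothetical and are not taken from NeMo; zero padding and other formatting details are left out.

```python
import re
from itertools import permutations

def parse_fields(tagged):
    """Extract (key, value) pairs from e.g. 'time { minutes: "12" hours: "1" }'."""
    return re.findall(r'(\w+): "([^"]*)"', tagged)

def candidate_orderings(tagged):
    """Yield the tagged string once per ordering of its fields."""
    semiotic_class = tagged.split("{", 1)[0].strip()
    for perm in permutations(parse_fields(tagged)):
        body = " ".join(f'{key}: "{value}"' for key, value in perm)
        yield f"{semiotic_class} {{ {body} }}"

# Hypothetical stand-in for a verbalizer grammar: it accepts the fields
# in one fixed order only (hours first, then minutes).
VERBALIZER = re.compile(r'time \{ hours: "(\d+)" minutes: "(\d+)" \}')

def verbalize(tagged):
    """Return 'H:M' for the first field ordering the verbalizer accepts, else None."""
    for candidate in candidate_orderings(tagged):
        match = VERBALIZER.fullmatch(candidate)
        if match:
            hours, minutes = match.groups()
            return f"{hours}:{minutes}"
    return None

print(verbalize('time { hours: "12" minutes: "30" }'))
print(verbalize('time { minutes: "12" hours: "1" }'))
```

In this sketch the tagger is free to emit minutes before hours, because the ordering that the verbalizer expects is found by search rather than fixed at tagging time.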