Closed: ahazned closed this issue 1 year ago
Hi,
nemo_text_processing/fst_alignment/alignment.py works fine in the TN (text normalization) case, where input words are aligned to output words:
inp string: |1994|
out string: |tokens { date { year: "nineteen ninety four" } }|
inp indices: [0:4] out indices: [23:43]
in: |1994| out: |nineteen ninety four|
But in the ITN (inverse text normalization) case, the alignment seems to be broken: the input words that are inverse normalized are mapped to empty strings:
inp string: |nineteen ninety four|
out string: |tokens { date { year: "1994" preserve_order: true } }|
inp indices: [0:8] out indices: [23:23]
in: |nineteen| out: ||
inp indices: [9:15] out indices: [25:25]
in: |ninety| out: ||
inp indices: [16:20] out indices: [26:26]
in: |four| out: ||
Is there a way to get the form below in the ITN case?
in: |nineteen ninety four| out: |1994|
Thank you very much.
https://github.com/NVIDIA/NeMo-text-processing/pull/47
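Separately from the linked PR, the grouping asked for above can be approximated as a post-processing step over the per-word indices already printed in the report. The following is only a sketch under stated assumptions: `merge_by_field` is a hypothetical helper, not part of nemo_text_processing, and it assumes that each word's output start index falls inside (or at the edge of) the quoted field value it contributed to, as it does in the example above. It uses only the Python standard library and does not handle escaped quotes inside field values.

```python
import re

def merge_by_field(inp, out, word_aligns):
    """Group per-word alignments by the quoted output field they point into.

    word_aligns: list of ((in_start, in_end), (out_start, out_end)) pairs,
    i.e. the "inp indices" / "out indices" values from the report.
    Hypothetical helper, not a nemo_text_processing API.
    """
    # Character spans of quoted field values in the tagged output (quotes excluded).
    fields = [(m.start(1), m.end(1)) for m in re.finditer(r'"([^"]*)"', out)]
    merged = {}
    for (i0, i1), (o0, _o1) in word_aligns:
        for f0, f1 in fields:
            if f0 <= o0 <= f1:
                # Widen the input span collected so far for this field.
                a, b = merged.get((f0, f1), (i0, i1))
                merged[(f0, f1)] = (min(a, i0), max(b, i1))
                break
    return [(inp[a:b], out[f0:f1]) for (f0, f1), (a, b) in sorted(merged.items())]

# Values copied from the ITN example in this issue.
itn_inp = "nineteen ninety four"
itn_out = 'tokens { date { year: "1994" preserve_order: true } }'
itn_aligns = [((0, 8), (23, 23)), ((9, 15), (25, 25)), ((16, 20), (26, 26))]
itn_result = merge_by_field(itn_inp, itn_out, itn_aligns)

# Values copied from the TN example in this issue.
tn_inp = "1994"
tn_out = 'tokens { date { year: "nineteen ninety four" } }'
tn_result = merge_by_field(tn_inp, tn_out, [((0, 4), (23, 43))])

for src, dst in itn_result + tn_result:
    print(f"in: |{src}| out: |{dst}|")
```

Grouping by quoted field rather than by individual word is a deliberate choice here: in ITN several input words collapse into one output value, so the field is the smallest unit on the output side that can be paired with an input span.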