NVIDIA / NeMo-text-processing

NeMo text processing for ASR and TTS
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_normalization/wfst/wfst_text_normalization.html
Apache License 2.0

bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162

Closed hannan72 closed 1 month ago

hannan72 commented 2 months ago

In ./nemo_text_processing/inverse_text_normalization/zh/graph_utils.py, line 79 calls load_labels(), but the function is never imported, so it raises an error. This can be resolved by adding the following function to ./nemo_text_processing/inverse_text_normalization/zh/utils.py:

import csv  # required by csv.reader below


def load_labels(abs_path):
    """
    Loads a tab-separated file and returns its rows.

    Args:
        abs_path: absolute path to the TSV file

    Returns a list of rows, each row a list of column strings
    """
    with open(abs_path, encoding="utf-8") as label_tsv:
        labels = list(csv.reader(label_tsv, delimiter="\t"))
    return labels

and import it in ./nemo_text_processing/inverse_text_normalization/zh/graph_utils.py:

  from nemo_text_processing.inverse_text_normalization.zh.utils import load_labels
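For reference, here is a minimal, self-contained sketch of how the proposed load_labels() behaves. The file name labels.tsv and its two rows are made up for illustration and are not taken from the repository. Note that the function returns a list of rows, not a dictionary.

```python
import csv
import os
import tempfile


def load_labels(abs_path):
    """Return the rows of a tab-separated file as a list of lists of strings."""
    with open(abs_path, encoding="utf-8") as label_tsv:
        return list(csv.reader(label_tsv, delimiter="\t"))


# Hypothetical two-column label file, written to a temporary directory
# so the example does not depend on any file in the repository.
with tempfile.TemporaryDirectory() as tmp_dir:
    tsv_path = os.path.join(tmp_dir, "labels.tsv")
    with open(tsv_path, "w", encoding="utf-8", newline="") as f:
        f.write("one\t1\ntwo\t2\n")
    labels = load_labels(tsv_path)

print(labels)  # [['one', '1'], ['two', '2']]
```

Each row comes back as a list of its tab-separated columns, so a caller that needs a mapping would have to build it from these rows itself.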

There is also a bug in the Arabic TN tagger: in nemo_text_processing/text_normalization/ar/taggers/decimal.py, line 40 uses quantities, which is not defined. It can be resolved by adding the following line before it (at line 33):

quantities = pynini.string_file(get_abs_path("data/numbers/quantities.tsv"))

ekmb commented 2 months ago

Thanks for reporting this @hannan72!

@BuyuanCui could you add the fix for zh in PR112?

@mgrafu could you take a look at the ar issue?

BuyuanCui commented 2 months ago

Will update accordingly.

BuyuanCui commented 2 months ago

Updates made

ekmb commented 1 month ago

@hannan72 closing this one as fixed, please re-open if the error persists.