NVIDIA / NeMo-text-processing

NeMo text processing for ASR and TTS
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_normalization/wfst/wfst_text_normalization.html
Apache License 2.0

[zh] WARNING:NeMo-text-processing:Failed text: 免除GOOGLE在一桩诽谤官司中的法律责任。Key: integer_part Value: None #175

Open XuesongYang opened 1 month ago

XuesongYang commented 1 month ago

I received a warning message when normalizing Chinese text. The input comes back unchanged. Could you please explain what the message indicates?

Reproducible code:

from nemo_text_processing.text_normalization.normalize import Normalizer
text_normalizer = Normalizer(lang="zh", input_case="cased", overwrite_cache=True, cache_dir="cache_dir")
text_normalizer_call_kwargs = {"punct_pre_process": True, "punct_post_process": True}
normalizer_call = lambda x: text_normalizer.normalize(x, **text_normalizer_call_kwargs)

text = "免除GOOGLE在一桩诽谤官司中的法律责任。"
normed_text = normalizer_call(text)
print(normed_text)

Output

NeMo-text-processing :: INFO     :: Created cache_dir/zh_tn_True_deterministic__tokenize.far
INFO:NeMo-text-processing:Created cache_dir/zh_tn_True_deterministic__tokenize.far
NeMo-text-processing :: WARNING  :: Failed text: 免除GOOGLE在一桩诽谤官司中的法律责任。Key: integer_part Value: None
WARNING:NeMo-text-processing:Failed text: 免除GOOGLE在一桩诽谤官司中的法律责任。Key: integer_part Value: None
免除GOOGLE在一桩诽谤官司中的法律责任。
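
To find which inputs in a larger batch trigger this warning, one option is to attach a standard `logging` handler and record the warnings per input. The sketch below is only an illustration. It assumes the warning is emitted through a Python logger named `NeMo-text-processing` (inferred from the `WARNING:NeMo-text-processing:...` line above). `WarningCollector`, `normalize_and_collect`, and `fake_normalize` are hypothetical names, and `fake_normalize` is a stand-in for `text_normalizer.normalize` so that the snippet runs without NeMo installed. It says nothing about why the real normalizer fails.

```python
import logging

# Assumption: logger name inferred from the "WARNING:NeMo-text-processing:..." output.
LOGGER_NAME = "NeMo-text-processing"


class WarningCollector(logging.Handler):
    """Stores the message of every record at WARNING level or above."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def normalize_and_collect(normalize_fn, texts):
    """Run normalize_fn on each text; return (outputs, warnings_per_text)."""
    logger = logging.getLogger(LOGGER_NAME)
    collector = WarningCollector()
    logger.addHandler(collector)
    outputs, warnings_per_text = [], []
    try:
        for text in texts:
            start = len(collector.messages)
            outputs.append(normalize_fn(text))
            # Keep only the warnings logged while this text was processed.
            warnings_per_text.append(collector.messages[start:])
    finally:
        logger.removeHandler(collector)
    return outputs, warnings_per_text


# Hypothetical stand-in for text_normalizer.normalize: it warns for a
# hard-coded set of inputs purely so the sketch can run on its own.
SIMULATED_FAILURES = {"免除GOOGLE在一桩诽谤官司中的法律责任。"}


def fake_normalize(text):
    if text in SIMULATED_FAILURES:
        logging.getLogger(LOGGER_NAME).warning("Failed text: %s", text)
    return text
```

With the real normalizer, `normalizer_call` from the reproducible code would be passed in place of `fake_normalize`. Any input whose warning list is non-empty is a candidate to include in a bug report.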

ekmb commented 1 month ago

@BuyuanCui, could you please take a look?

BuyuanCui commented 1 month ago

Investigating.

github-actions[bot] commented 3 days ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.