Closed lifeiteng closed 2 months ago
@BuyuanCui could you please take a look?
This seems to be related to the existing TN bug: it was not able to process a whole sentence. It will be fixed by the PR that I'm working on.
@lifeiteng a few options to speed up:
--cache_dir
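For anyone landing here, a minimal sketch of using a grammar cache from Python rather than the `--cache_dir` flag. It assumes the `Normalizer` class in `nemo_text_processing.text_normalization.normalize` accepts `input_case`, `lang`, and `cache_dir`; the helper names and the `./tn_cache` path are hypothetical. As far as I understand, the cache only avoids rebuilding the grammars on startup, so it should not change per-sentence normalization time.

```python
from pathlib import Path


def normalizer_kwargs(lang: str, cache_dir: str) -> dict:
    """Build constructor arguments and make sure the cache directory exists.

    Hypothetical helper, not part of nemo_text_processing.
    """
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    return {"input_case": "cased", "lang": lang, "cache_dir": cache_dir}


def build_normalizer(lang: str = "zh", cache_dir: str = "./tn_cache"):
    """Create a Normalizer that reuses compiled grammars from cache_dir.

    Assumption: Normalizer accepts input_case, lang and cache_dir.
    The import is deferred so this module loads without NeMo installed.
    """
    from nemo_text_processing.text_normalization.normalize import Normalizer

    return Normalizer(**normalizer_kwargs(lang, cache_dir))
```

The first call to `build_normalizer()` would still compile the grammars; later runs pointing at the same directory should start faster.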
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
Has this problem been solved? There are still problems in version 0.3.0.
I've found that the TN FST is slow regardless of language (English too). It is not very practical with large data even using multiprocessing (normalize_list()). Any other ways to speed it up?
@riqiang-dp we recommend Sparrowhawk for deployment https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/nlp/text_normalization/wfst/wfst_text_processing_deployment.html
Normalizing one simple zh-CN sentence takes 1.32 sec, and the result is not right.
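To make a report like this reproducible, it can help to time each sentence separately so that the one-off grammar loading is not counted as per-sentence cost. Below is a rough sketch; `time_calls` is a hypothetical helper, and `normalize_fn` stands in for whatever callable is being measured (with NeMo that would presumably be the `normalize` method of a `Normalizer` instance, which is an assumption here).

```python
import time
from typing import Callable, List, Tuple


def time_calls(
    normalize_fn: Callable[[str], str], sentences: List[str]
) -> List[Tuple[str, str, float]]:
    """Run normalize_fn on each sentence and record the wall-clock time.

    Returns one (input, output, seconds) tuple per sentence.
    """
    results = []
    for sentence in sentences:
        start = time.perf_counter()
        output = normalize_fn(sentence)
        elapsed = time.perf_counter() - start
        results.append((sentence, output, elapsed))
    return results


if __name__ == "__main__":
    # Stand-in normalizer so the script runs without NeMo installed.
    for source, output, seconds in time_calls(str.strip, ["  example 1  ", "example 2"]):
        print(f"{seconds:.6f}s  {source!r} -> {output!r}")
```

Posting the input sentence, the actual output, and the expected output alongside the timing makes it easier to tell the speed problem apart from the correctness problem.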