NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

duplex_text_normalization_infer.py FST error #4507

Closed uralik closed 2 years ago

uralik commented 2 years ago

Describe the bug

The inference example crashes at runtime when run with the pretrained T5 model.

Steps/Code to reproduce bug

python examples/nlp/duplex_text_normalization/duplex_text_normalization_infer.py lang=en mode=tn tagger_pretrained_model=neural_text_normalization_t5 decoder_pretrained_model=neural_text_normalization_t5 inference.from_file=./text_en.txt

Traceback:

[NeMo I 2022-07-06 11:06:49 duplex_text_normalization_infer:83] Running inference on ./text_en.txt...
  0%|          | 0/1 [00:00<?, ?it/s]
ERROR: StringFstToOutputLabels: Invalid start state
Error executing job with overrides: ['lang=en', 'mode=tn', 'tagger_pretrained_model=neural_text_normalization_t5', 'decoder_pretrained_model=neural_text_normalization_t5', 'inference.from_file=./text_en.txt']
An error occurred during Hydra's exception formatting:
AssertionError()
Traceback (most recent call last):
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/_internal/utils.py", line 252, in run_and_report
    assert mdl is not None
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "examples/nlp/duplex_text_normalization/duplex_text_normalization_infer.py", line 155, in <module>
    main()
  File "/private/home/kulikov/code/NeMo/nemo/core/config/hydra_runner.py", line 104, in wrapper
    _run_hydra(
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/_internal/utils.py", line 377, in _run_hydra
    run_and_report(
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/_internal/utils.py", line 294, in run_and_report
    raise ex
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/_internal/utils.py", line 378, in <lambda>
    lambda: hydra.run(
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 111, in run
    _ = ret.return_value
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value
    raise self._return_value
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job
    ret.return_value = task_function(task_cfg)
  File "examples/nlp/duplex_text_normalization/duplex_text_normalization_infer.py", line 91, in main
    new_lines = normalizer_electronic.normalize_list(lines)
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 150, in normalize_list
    raise e
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 145, in normalize_list
    normalized_texts = Parallel(n_jobs=n_jobs)(
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/joblib/parallel.py", line 1043, in __call__
    if self.dispatch_one_batch(iterator):
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
    self._dispatch(tasks)
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/joblib/parallel.py", line 779, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/private/home/kulikov/miniconda3/envs/nemo/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 164, in __process_batch
    normalized_lines = [
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 165, in <listcomp>
    self.normalize(
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 268, in normalize
    tagged_text = self.select_tag(tagged_lattice)
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 382, in select_tag
    tagged_text = pynini.shortestpath(lattice, nshortest=1, unique=True).string()
  File "extensions/_pynini.pyx", line 471, in _pynini.Fst.string
  File "extensions/_pynini.pyx", line 516, in _pynini.Fst.string
_pywrapfst.FstOpError: Operation failed

where

cat text_en.txt

he was born in 1992

Expected behavior

The example is expected to work! It looks like something is wrong with the pynini library.
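
The `Invalid start state` message in the traceback above is what OpenFst logs when it is asked to read a string out of an FST that has no start state, which is how an empty lattice presents itself. Below is a minimal sketch of a guard for that case, not the fix that was eventually merged. It assumes only that the object exposes `start()` and `string()` the way pynini's `Fst` does, and that the "no state" sentinel is `-1` (OpenFst's `kNoStateId`). The helper names `has_start_state` and `safe_string` are hypothetical and are not part of NeMo or pynini.

```python
# Assumption: OpenFst's kNoStateId is -1, which pynini is expected to
# expose as pynini.NO_STATE_ID. Hard-coded here to keep the sketch standalone.
NO_STATE_ID = -1


def has_start_state(fst) -> bool:
    """Return True if `fst` has a start state, i.e. it is not an empty FST."""
    return fst.start() != NO_STATE_ID


def safe_string(fst, default=None):
    """Return fst.string() when the FST has a start state, else `default`.

    Hypothetical helper: it only avoids calling string() on an empty FST;
    it does not explain why the lattice ended up empty.
    """
    if not has_start_state(fst):
        return default
    return fst.string()
```

With a guard like this, the call on line 382 of `normalize.py` would return the default instead of raising `FstOpError`, which makes it easier to tell an empty lattice apart from a broken pynini installation.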

Additional context

pynini 2.1.4
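
Since the thread turns on which pynini version is actually in use, one way to confirm what the active environment resolves is to query the package metadata from Python. This is a small sketch using only the standard library (Python 3.8+). The function name `installed_version` is hypothetical, and `pynini` is assumed to be the distribution name, as in the install commands in this thread.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run with the same interpreter that launches the inference script.
    print("pynini:", installed_version("pynini"))
```

Running it inside each conda environment shows whether both environments really resolve the same pynini build.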

ekmb commented 2 years ago

The version looks correct and is expected to work. Could you retry in a clean conda environment with Python 3.8 or 3.9 and use the following to install pynini:

#!/bin/bash
if [[ $OSTYPE == 'darwin'* ]]; then
  conda install -c conda-forge -y pynini=2.1.4
else
  pip install pynini==2.1.4
fi

uralik commented 2 years ago

Thanks for the response, @ekmb. As I wrote in my post, my pynini version is 2.1.4. Anyway, I followed your suggestion and created another clean conda env. This is the list of installed Python packages:

absl-py==1.1.0
aiohttp==3.8.1
aiosignal==1.2.0
alabaster==0.7.12
aniso8601==9.0.1
antlr4-python3-runtime==4.8
appdirs==1.4.4
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
asttokens==2.0.5
async-timeout==4.0.2
attrdict==2.0.1
attrs==21.4.0
audioread==2.1.9
Babel==2.10.3
backcall==0.2.0
beautifulsoup4==4.11.1
black==19.10b0
bleach==5.0.1
boto3==1.24.24
botocore==1.27.24
braceexpand==0.1.7
brotlipy==0.7.0
cachetools==5.2.0
certifi==2022.6.15
cffi @ file:///tmp/build/80754af9/cffi_1636541934635/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click==8.1.3
colorama==0.4.5
commonmark==0.9.1
cryptography @ file:///tmp/build/80754af9/cryptography_1652083738073/work
cycler==0.11.0
Cython==0.29.30
debugpy==1.6.1
decorator==5.1.1
defusedxml==0.7.1
Distance==0.1.3
docker-pycreds==0.4.0
docopt==0.6.2
docutils==0.18.1
editdistance==0.6.0
einops==0.4.1
entrypoints==0.4
executing==0.8.3
faiss-cpu==1.7.2
fastjsonschema==2.15.3
fasttext==0.9.2
filelock==3.7.1
-e git+https://github.com/flashlight/flashlight@942414d530f19a1fca965499c8a5441405bd3a16#egg=flashlight&subdirectory=bindings/python
Flask==2.1.2
Flask-RESTful==0.3.9
fonttools==4.34.3
frozendict==2.3.2
frozenlist==1.3.0
fsspec==2022.5.0
ftfy==6.1.1
g2p-en==2.1.0
gdown==4.5.1
gitdb==4.0.9
GitPython==3.1.27
google-auth==2.9.0
google-auth-oauthlib==0.4.6
grpcio==1.47.0
h5py==3.7.0
huggingface-hub==0.8.1
hydra-core==1.1.2
idna @ file:///tmp/build/80754af9/idna_1637925883363/work
ijson==3.1.4
imagesize==1.4.1
importlib-metadata==4.12.0
importlib-resources==5.2.3
inflect==5.6.0
iniconfig==1.1.1
ipadic==1.0.0
ipdb==0.13.9
ipykernel==6.15.0
ipython==8.4.0
ipython-genutils==0.2.0
ipywidgets==7.7.1
isort==4.3.21
itsdangerous==2.1.2
jarowinkler==1.1.0
jedi==0.18.1
jieba==0.42.1
Jinja2==3.1.2
jmespath==1.0.1
joblib==1.1.0
jsonschema==4.6.2
jupyter-client==7.3.4
jupyter-core==4.11.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.1
kaldi-python-io==1.2.2
kaldiio==2.17.2
kiwisolver==1.4.3
latexcodec==2.0.1
librosa==0.9.2
llvmlite==0.38.0
Markdown==3.3.7
MarkupSafe==2.1.1
marshmallow==3.17.0
matplotlib==3.5.2
matplotlib-inline==0.1.3
mecab-python3==1.0.5
mistune==0.8.4
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
mpmath==1.2.1
multidict==6.0.2
nbclient==0.6.6
nbconvert==6.5.0
nbformat==5.4.0
-e git+ssh://git@github.com/NVIDIA/NeMo.git@a0e94df5e43e45ff28923e252cd9a1f3a4d9bd75#egg=nemo_toolkit
nest-asyncio==1.5.5
nltk==3.7
notebook==6.4.12
numba @ file:///home/conda/feedstock_root/build_artifacts/numba_1642189978584/work
numpy==1.23.0
oauthlib==3.2.0
omegaconf==2.1.2
onnx==1.12.0
OpenCC==1.1.4
packaging==21.3
pandas==1.4.3
pandocfilters==1.5.0
pangu==4.0.6.1
parameterized==0.8.1
parso==0.8.3
pathspec==0.9.0
pathtools==0.1.2
pesq==0.0.4
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.0.1
pip-api==0.0.29
pipreqs==0.4.11
pluggy==1.0.0
pooch==1.6.0
portalocker==2.4.0
prometheus-client==0.14.1
promise==2.3
prompt-toolkit==3.0.30
protobuf==3.19.4
psutil==5.9.1
ptyprocess==0.7.0
pure-eval==0.2.2
py==1.11.0
pyannote.core==4.4
pyannote.database==4.1.3
pyannote.metrics==3.2.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pybind11==2.9.2
pybtex==0.24.0
pybtex-docutils==1.0.2
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pyDeprecate==0.3.2
pydub==0.25.1
Pygments==2.12.0
pynini==2.1.4
pyOpenSSL @ file:///opt/conda/conda-bld/pyopenssl_1643788558760/work
pyparsing==3.0.9
pypinyin==0.46.0
pyrsistent==0.18.1
PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
pystoi==0.3.3
pytest==7.1.2
pytest-runner==6.0.0
python-dateutil==2.8.2
pytorch-lightning==1.6.4
pytz==2022.1
PyYAML==5.4.1
pyzmq==23.2.0
rapidfuzz==2.1.2
regex==2022.6.2
requests @ file:///opt/conda/conda-bld/requests_1656438147783/work
requests-oauthlib==1.3.1
resampy==0.3.1
rich==12.4.4
rsa==4.8
ruamel.yaml==0.17.21
ruamel.yaml.clib==0.2.6
s3transfer==0.6.0
sacrebleu==2.1.0
sacremoses==0.0.53
scikit-learn==1.1.1
scipy==1.8.1
Send2Trash==1.8.0
sentence-transformers==2.2.2
sentencepiece==0.1.96
sentry-sdk==1.6.0
setproctitle==1.2.3
shellingham==1.4.0
shortuuid==1.0.9
simplejson==3.17.6
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.0
snowballstemmer==2.2.0
sortedcontainers==2.4.0
SoundFile==0.10.3.post1
soupsieve==2.3.2.post1
sox==1.4.1
Sphinx==5.0.2
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-bibtex==2.4.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.0
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
stack-data==0.3.0
sympy==1.10.1
tabulate==0.8.10
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
terminado==0.15.0
threadpoolctl==3.1.0
tinycss2==1.1.1
tokenizers==0.12.1
toml==0.10.2
tomli==2.0.1
torch==1.12.0
torchaudio==0.12.0
torchmetrics==0.9.2
torchvision==0.13.0
tornado==6.2
tqdm==4.64.0
traitlets==5.3.0
transformers==4.20.1
typed-ast==1.5.4
typer==0.5.0
typing_extensions @ file:///opt/conda/conda-bld/typing_extensions_1647553014482/work
Unidecode==1.3.4
urllib3 @ file:///opt/conda/conda-bld/urllib3_1650639997961/work
wandb==0.12.21
wcwidth==0.2.5
webdataset==0.1.62
webencodings==0.5.1
Werkzeug==2.1.2
wget==3.2
widgetsnbextension==3.6.1
wrapt==1.14.1
yarg==0.1.9
yarl==1.7.2
youtokentome==1.0.6
zipp==3.8.0

The command still returns the same error. Here is the full output I get in the terminal:

[NeMo W 2022-07-07 10:29:38 optimizers:55] Apex was not found. Using the lamb or fused_adam optimizer will error out.
[NeMo W 2022-07-07 10:29:39 experimental:27] Module <class 'nemo.collections.nlp.data.language_modeling.megatron.megatron_batch_samplers.MegatronPretrainingRandomBatchSampler'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2022-07-07 10:29:40 experimental:27] Module <class 'nemo.collections.nlp.models.text_normalization_as_tagging.thutmose_tagger.ThutmoseTaggerModel'> is experimental, not ready for production and is not fully supported. Use at your own risk.
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[NeMo I 2022-07-07 10:29:40 helpers:63] Loading pretrained model neural_text_normalization_t5
[NeMo I 2022-07-07 10:29:40 cloud:56] Found existing object /private/home/kulikov/.cache/torch/NeMo/NeMo_1.11.0rc0/neural_text_normalization_t5_tagger/e48ca3e94a215c727d6d325ad2fe0f99/neural_text_normalization_t5_tagger.nemo.
[NeMo I 2022-07-07 10:29:40 cloud:62] Re-using file from: /private/home/kulikov/.cache/torch/NeMo/NeMo_1.11.0rc0/neural_text_normalization_t5_tagger/e48ca3e94a215c727d6d325ad2fe0f99/neural_text_normalization_t5_tagger.nemo
[NeMo I 2022-07-07 10:29:40 common:789] Instantiating model from pre-trained checkpoint
Created a temporary directory at /tmp/tmpo6wfprba
Writing /tmp/tmpo6wfprba/_remote_module_non_scriptable.py
[NeMo W 2022-07-07 10:29:43 nlp_overrides:223] Apex was not found. Please see the NeMo README for installation instructions: https://github.com/NVIDIA/apex
    Megatron-based models require Apex to function correctly.
Some weights of the model checkpoint at albert-base-v2 were not used when initializing AlbertForTokenClassification: ['predictions.decoder.bias', 'predictions.dense.bias', 'predictions.LayerNorm.bias', 'predictions.decoder.weight', 'predictions.dense.weight', 'predictions.LayerNorm.weight', 'predictions.bias']
- This IS expected if you are initializing AlbertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing AlbertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of AlbertForTokenClassification were not initialized from the model checkpoint at albert-base-v2 and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[NeMo W 2022-07-07 10:29:44 nemo_logging:349] /private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Torchmetrics v0.9 introduced a new argument class property called `full_state_update` that has
                    not been set for this class (ClassificationReport). The property determines if `update` by
                    default needs access to the full metric state. If this is not the case, significant speedups can be
                    achieved and we recommend setting this to `False`.
                    We provide an checking function
                    `from torchmetrics.utilities import check_forward_no_full_state`
                    that can be used to check if the `full_state_update=True` (old and potential slower behaviour,
                    default for now) or if `full_state_update=False` can be used safely.

      warnings.warn(*args, **kwargs)

[NeMo I 2022-07-07 10:29:45 save_restore_connector:243] Model DuplexTaggerModel was successfully restored from /private/home/kulikov/.cache/torch/NeMo/NeMo_1.11.0rc0/neural_text_normalization_t5_tagger/e48ca3e94a215c727d6d325ad2fe0f99/neural_text_normalization_t5_tagger.nemo.
[NeMo I 2022-07-07 10:29:45 helpers:99] Model tagger -- Device cuda:0
GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[NeMo W 2022-07-07 10:29:45 nemo_logging:349] /private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1814: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=2)`.
      rank_zero_warn(

`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
[NeMo I 2022-07-07 10:29:45 helpers:63] Loading pretrained model neural_text_normalization_t5
[NeMo I 2022-07-07 10:29:45 cloud:56] Found existing object /private/home/kulikov/.cache/torch/NeMo/NeMo_1.11.0rc0/neural_text_normalization_t5_decoder/9261ad441d44e228c624bda7cd6c94ac/neural_text_normalization_t5_decoder.nemo.
[NeMo I 2022-07-07 10:29:45 cloud:62] Re-using file from: /private/home/kulikov/.cache/torch/NeMo/NeMo_1.11.0rc0/neural_text_normalization_t5_decoder/9261ad441d44e228c624bda7cd6c94ac/neural_text_normalization_t5_decoder.nemo
[NeMo I 2022-07-07 10:29:45 common:789] Instantiating model from pre-trained checkpoint
[NeMo W 2022-07-07 10:29:49 nemo_logging:349] /private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/transformers/models/t5/tokenization_t5_fast.py:156: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
    For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
    - Be aware that you SHOULD NOT rely on t5-small automatically truncating your input to 512 when padding/encoding.
    - If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
    - To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
      warnings.warn(

[NeMo W 2022-07-07 10:29:49 modelPT:156] If you intend to do validation, please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s). 
    Validation config : 
    data_path: /WS/text_norm/data/test_joint_punct_no5plainletterurlverbatim_dash_electronic.tsv
    batch_size: 64
    shuffle: false
    max_insts: -1
    use_cache: true
    num_workers: 3
    pin_memory: false
    drop_last: false

[NeMo W 2022-07-07 10:29:49 nlp_overrides:223] Apex was not found. Please see the NeMo README for installation instructions: https://github.com/NVIDIA/apex
    Megatron-based models require Apex to function correctly.
[NeMo I 2022-07-07 10:29:51 save_restore_connector:243] Model DuplexDecoderModel was successfully restored from /private/home/kulikov/.cache/torch/NeMo/NeMo_1.11.0rc0/neural_text_normalization_t5_decoder/9261ad441d44e228c624bda7cd6c94ac/neural_text_normalization_t5_decoder.nemo.
[NeMo I 2022-07-07 10:29:51 helpers:99] Model decoder -- Device cuda:0
[NeMo I 2022-07-07 10:29:53 tokenize_and_classify:87] Creating ClassifyFst grammars.
[NeMo I 2022-07-07 10:30:16 tokenize_and_classify:61] Creating ClassifyFst grammars.
[NeMo I 2022-07-07 10:30:20 tokenize_and_classify:87] Creating ClassifyFst grammars.
[NeMo I 2022-07-07 10:30:43 tokenize_and_classify:70] Creating ClassifyFst grammars.
[NeMo I 2022-07-07 10:30:47 duplex_text_normalization_infer:83] Running inference on ./text_en.txt...

  0%|          | 0/1 [00:00<?, ?it/s]
ERROR: StringFstToOutputLabels: Invalid start state
Error executing job with overrides: ['lang=en', 'mode=tn', 'tagger_pretrained_model=neural_text_normalization_t5', 'decoder_pretrained_model=neural_text_normalization_t5', 'inference.from_file=./text_en.txt']
An error occurred during Hydra's exception formatting:
AssertionError()
Traceback (most recent call last):
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 252, in run_and_report
    assert mdl is not None
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "examples/nlp/duplex_text_normalization/duplex_text_normalization_infer.py", line 155, in <module>
    main()
  File "/private/home/kulikov/code/NeMo/nemo/core/config/hydra_runner.py", line 104, in wrapper
    _run_hydra(
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 377, in _run_hydra
    run_and_report(
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 294, in run_and_report
    raise ex
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 378, in <lambda>
    lambda: hydra.run(
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 111, in run
    _ = ret.return_value
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value
    raise self._return_value
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job
    ret.return_value = task_function(task_cfg)
  File "examples/nlp/duplex_text_normalization/duplex_text_normalization_infer.py", line 91, in main
    new_lines = normalizer_electronic.normalize_list(lines)
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 150, in normalize_list
    raise e
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 145, in normalize_list
    normalized_texts = Parallel(n_jobs=n_jobs)(
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/joblib/parallel.py", line 1043, in __call__
    if self.dispatch_one_batch(iterator):
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
    self._dispatch(tasks)
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/joblib/parallel.py", line 779, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/private/home/kulikov/miniconda3/envs/nemo2/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 164, in __process_batch
    normalized_lines = [
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 165, in <listcomp>
    self.normalize(
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 268, in normalize
    tagged_text = self.select_tag(tagged_lattice)
  File "/private/home/kulikov/code/NeMo/nemo_text_processing/text_normalization/normalize.py", line 382, in select_tag
    tagged_text = pynini.shortestpath(lattice, nshortest=1, unique=True).string()
  File "extensions/_pynini.pyx", line 471, in _pynini.Fst.string
  File "extensions/_pynini.pyx", line 516, in _pynini.Fst.string
_pywrapfst.FstOpError: Operation failed

  0%|          | 0/1 [00:02<?, ?it/s]

So something is not right there for sure. Does it work well on your side?

ekmb commented 2 years ago

@uralik, I made a few updates with https://github.com/NVIDIA/NeMo/pull/4517, could you try running from duplex_merge_fix branch?

uralik commented 2 years ago

It works!! =)

Thanks for such a quick fix! Since we are on this topic, may I ask whether you plan to release the Ru pretrained model as well?

ekmb commented 2 years ago

I'm glad it worked for you! No, we don't have immediate plans to release the Ru Duplex checkpoint, but all the training scripts should work with the Ru Google data out of the box. For ITN we also have a new tagger-based model; the TN variant is a work in progress.