NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

Error converting nemo file to riva #2743

Closed peterhanlon closed 2 years ago

peterhanlon commented 2 years ago

Describe the bug

When I attempt to convert a .nemo file (fine-tuned from the Conformer model) to Riva format, I get the following error:

nemo2riva --out moneypenny-confirmer-v1.riva /data/models/moneypenny-confirmer-v1.nemo
[NeMo W 2021-08-28 08:49:32 optimizers:47] Apex was not found. Using the lamb optimizer will error out.
INFO: Logging level set to 20
INFO: Restoring NeMo model from '/data/models/moneypenny-confirmer-v1.nemo'
ERROR: Nemo2Jarvis: Failed to restore model from NeMo file : /data/models/moneypenny-confirmer-v1.nemo. Please make sure you have the latest NeMo package installed with [all] dependencies.
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/riva/bin/nemo2riva", line 8, in <module>
    sys.exit(nemo2riva())
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo2riva/cli/nemo2riva.py", line 50, in nemo2riva
    Nemo2Riva(args)
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo2riva/convert.py", line 42, in Nemo2Riva
    raise e
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo2riva/convert.py", line 35, in Nemo2Riva
    model = ModelPT.restore_from(restore_path=nemo_in)
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/core/classes/modelPT.py", line 270, in restore_from
    instance = cls._save_restore_connector.restore_from(
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/core/connectors/save_restore_connector.py", line 136, in restore_from
    instance = calling_cls.from_config_dict(config=conf)
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/core/classes/common.py", line 473, in from_config_dict
    instance = cls(cfg=config)
TypeError: Can't instantiate abstract class ModelPT with abstract methods list_available_models, setup_training_data, setup_validation_data

Steps/Code to reproduce bug

ngc registry resource download-version nvidia/riva/riva_quickstart
conda create --name riva python=3.8
conda activate riva
cd riva_quickstart_v1.5.0-beta
pip install nemo-toolkit[all]
pip3 install riva_api-1.5.0b0-py3-none-any.whl
pip3 install nemo2riva-1.5.0b0-py3-none-any.whl
pip3 install nvidia-pyindex
pip3 install nemo2riva
nemo2riva --out moneypenny-confirmer-v1.riva /data/models/moneypenny-confirmer-v1.nemo

-- Running nemo2riva then throws the error outlined above --

Expected behavior

The nemo file should be converted into a riva file

ryanleary commented 2 years ago

Could you please turn on debug logging and share the returned logs?

peterhanlon commented 2 years ago

Sure, but how do I do that?

peterhanlon commented 2 years ago

I'm assuming "--verbose=DEBUG" is the correct switch to activate the debug logging? The output is as follows:

nemo2riva --verbose=DEBUG --out moneypenny-confirmer-v1.riva /data/models/moneypenny-confirmer-v1.nemo
[NeMo W 2021-08-31 09:58:37 optimizers:47 rank:0] Apex was not found. Using the lamb optimizer will error out.
INFO: Logging level set to 10
INFO: Restoring NeMo model from '/data/models/moneypenny-confirmer-v1.nemo'
[NeMo D 2021-08-31 09:58:39 common:468 rank:0] Model instantiation from target class nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE failed with following error. Falling back to cls.
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/core/classes/common.py", line 448, in from_config_dict
    imported_cls = import_class_by_path(target_cls)
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/utils/model_utils.py", line 437, in import_class_by_path
    mod = __import__(path, fromlist=[class_name])
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/__init__.py", line 15, in <module>
    from nemo.collections.asr import data, losses, models, modules
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/models/__init__.py", line 16, in <module>
    from nemo.collections.asr.models.classification_models import EncDecClassificationModel
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/models/classification_models.py", line 28, in <module>
    from nemo.collections.asr.data import audio_to_label_dataset
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/data/audio_to_label_dataset.py", line 15, in <module>
    from nemo.collections.asr.data import audio_to_label
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/data/audio_to_label.py", line 22, in <module>
    from nemo.collections.asr.parts.preprocessing.segment import available_formats as valid_sf_formats
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/parts/preprocessing/__init__.py", line 16, in <module>
    from nemo.collections.asr.parts.preprocessing.features import (
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/parts/preprocessing/features.py", line 46, in <module>
    from nemo.collections.asr.parts.preprocessing.perturb import AudioAugmentor
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/asr/parts/preprocessing/perturb.py", line 52, in <module>
    from nemo.collections.common.parts.preprocessing import collections, parsers
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/common/__init__.py", line 16, in <module>
    from nemo.collections.common import data, losses, parts, tokenizers
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/common/tokenizers/__init__.py", line 17, in <module>
    from nemo.collections.common.tokenizers.huggingface.auto_tokenizer import AutoTokenizer
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/common/tokenizers/huggingface/__init__.py", line 15, in <module>
    from nemo.collections.common.tokenizers.huggingface.auto_tokenizer import AutoTokenizer
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/collections/common/tokenizers/huggingface/auto_tokenizer.py", line 18, in <module>
    from transformers import AutoTokenizer as AUTOTOKENIZER
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/transformers/__init__.py", line 43, in <module>
    from . import dependency_versions_check
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/transformers/dependency_versions_check.py", line 36, in <module>
    from .file_utils import is_tokenizers_available
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/transformers/file_utils.py", line 175, in <module>
    _onnx_available = importlib.util.find_spec("onnxruntime") is not None
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/importlib/util.py", line 114, in find_spec
    raise ValueError('{}.__spec__ is None'.format(name))
ValueError: onnxruntime.__spec__ is None
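
The debug traceback ends inside `importlib.util.find_spec("onnxruntime")`, which raises `ValueError` when a module is already registered in `sys.modules` but carries no import spec. The following is a minimal sketch of how that condition could be probed for any package; the helper name `probe` and the placeholder module `fake_onnxruntime` are made up for illustration and are not part of NeMo, Riva, or transformers.

```python
import importlib.util
import sys
import types


def probe(name):
    """Return a short status string describing whether `name` can be located."""
    try:
        spec = importlib.util.find_spec(name)
    except ValueError as exc:
        # Raised when the module is already in sys.modules but its __spec__
        # is unset or None, e.g. a stub or half-initialised module.
        return f"broken: {exc}"
    except ImportError as exc:
        # Raised for dotted names whose parent package cannot be imported.
        return f"import error: {exc}"
    return "missing" if spec is None else "ok"


# Simulate the failure mode with a hypothetical placeholder module:
# a bare ModuleType instance has __spec__ set to None.
sys.modules["fake_onnxruntime"] = types.ModuleType("fake_onnxruntime")

for package in ("fake_onnxruntime", "json", "surely_not_installed_pkg_xyz"):
    print(package, "->", probe(package))
```

Running `probe("onnxruntime")` in the same interpreter session that performs the conversion would show whether the package is absent, healthy, or registered in this broken state.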

ERROR: Nemo2Jarvis: Failed to restore model from NeMo file : /data/models/moneypenny-confirmer-v1.nemo. Please make sure you have the latest NeMo package installed with [all] dependencies.
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/riva/bin/nemo2riva", line 8, in <module>
    sys.exit(nemo2riva())
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo2riva/cli/nemo2riva.py", line 50, in nemo2riva
    Nemo2Riva(args)
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo2riva/convert.py", line 42, in Nemo2Riva
    raise e
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo2riva/convert.py", line 35, in Nemo2Riva
    model = ModelPT.restore_from(restore_path=nemo_in)
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/core/classes/modelPT.py", line 270, in restore_from
    instance = cls._save_restore_connector.restore_from(
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/core/connectors/save_restore_connector.py", line 136, in restore_from
    instance = calling_cls.from_config_dict(config=conf)
  File "/home/ubuntu/anaconda3/envs/riva/lib/python3.8/site-packages/nemo/core/classes/common.py", line 473, in from_config_dict
    instance = cls(cfg=config)
TypeError: Can't instantiate abstract class ModelPT with abstract methods list_available_models, setup_training_data, setup_validation_data
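
Read together, the two tracebacks suggest that the final `TypeError` is a symptom: the target class could not be imported, so instantiation fell back to the abstract base class. The sketch below illustrates that general pattern with hypothetical names (`BaseModel`, `from_config`); it is not NeMo's actual implementation, only a small reproduction of how a fallback can hide the original import error.

```python
import importlib
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical stand-in for an abstract base class such as ModelPT."""

    @abstractmethod
    def setup_training_data(self):
        ...

    @classmethod
    def from_config(cls, target):
        """Instantiate the class named by `target`, falling back to `cls`."""
        module_path, _, class_name = target.rpartition(".")
        try:
            module = importlib.import_module(module_path)
            target_cls = getattr(module, class_name)
        except Exception as exc:
            # The real cause is only reported here, at debug level.
            print(f"debug: could not import {target}: {exc!r}")
            target_cls = cls
        # If the fallback was taken, this tries to build the abstract base.
        return target_cls()


try:
    BaseModel.from_config("no_such_package.models.MyModel")
except TypeError as exc:
    # The user-visible error mentions abstract methods, not the failed import.
    print(f"TypeError: {exc}")
```

In this sketch the useful information is the debug line, which matches the experience in this thread: the cause only became visible once debug logging was enabled.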

peterhanlon commented 2 years ago

Does anyone have any idea how to resolve this one?

briebe commented 2 years ago

@peterhanlon I had this issue when creating my own PyTorch container and installing NeMo tools 1.3 and 1.1 (because this version is mentioned in the NVIDIA guide) along with their dependencies. I "solved" it by using the NGC Riva 1.4 quickstart: I was able to use nemo2riva after installing NeMo tools 1.3 inside the deployed riva-client container.

ryanleary commented 2 years ago

Please try using the latest nemo2riva. Which model architecture are you trying to convert? Is it one supported by Riva? The latest tooling is a bit more verbose when catching unsupported networks.