NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

ValueError: Error instantiating 'nemo.collections.asr.modules.conv_asr.SpeakerDecoder' : invalid literal for int() with base 10: ',' #2781

Closed: briebe closed this issue 3 years ago

briebe commented 3 years ago

Describe the bug

I was following the "NGC Pretrained Checkpoints" section of https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/speaker_recognition/results.html

Steps/Code to reproduce bug

Running:

import nemo.collections.asr as nemo_asr
nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(model_name="speakerrecognition_speakernet")

Expected behavior

No error when loading the pretrained model.

Environment overview (please complete the following information)

Same on a DGX with a working riva-client container that has been used for other applications.

Environment details

nithinraok commented 3 years ago

I think this is a Hydra config error. Can you check with the "speakerverification_speakernet" model? The speakerrecognition_speakernet pretrained model will be removed in the next release.
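To illustrate the kind of config problem that can produce this exact message, here is a minimal sketch. It assumes, without confirmation from the NeMo source, that a size field in the checkpoint config arrives as a comma-separated string and is then iterated element by element. `parse_sizes` is a hypothetical helper, not a NeMo function.

```python
def parse_sizes(value):
    """Convert every element of `value` to int, as a naive loop would."""
    return [int(item) for item in value]


# A real list converts cleanly.
ok = parse_sizes([512, 512])
print(ok)

# A comma-separated string is iterated character by character,
# so int() eventually receives the lone "," and raises ValueError.
try:
    parse_sizes("512,512")
except ValueError as err:
    message = str(err)
    print(message)

# Splitting the string first (hypothetical workaround) avoids the error.
fixed = parse_sizes("512,512".split(","))
print(fixed)
```

If something like this is the cause, the fix would be on the config side (a list instead of a string) or in how the value is parsed, which is consistent with an older checkpoint not matching a newer decoder.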

mhfarani1374 commented 3 years ago

I have the same problem with this checkpoint file for identification. Is it trained on AN4? How can I get a model trained on VoxCeleb? Thanks.

n8zach commented 1 year ago

speakerverification_speakernet.nemo is still available here: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/speakerrecognition_speakernet/files
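For anyone debugging a downloaded checkpoint: a .nemo file is a tar archive that carries the model config alongside the weights, so the decoder settings can be read without instantiating the model. The sketch below uses only the standard library. It assumes the config member is named model_config.yaml and that the config has a top-level `decoder:` key; `read_model_config` and `decoder_section` are hypothetical helper names.

```python
import tarfile


def read_model_config(nemo_path):
    """Return the text of model_config.yaml stored inside a .nemo archive."""
    # "r:*" lets tarfile detect whether the archive is compressed.
    with tarfile.open(nemo_path, "r:*") as archive:
        for member in archive.getmembers():
            # Member names may carry a "./" prefix, so match on the suffix.
            if member.name.endswith("model_config.yaml"):
                return archive.extractfile(member).read().decode("utf-8")
    raise FileNotFoundError("model_config.yaml not found in " + str(nemo_path))


def decoder_section(config_text):
    """Return the lines of the top-level `decoder:` block, without parsing YAML."""
    lines, inside = [], False
    for line in config_text.splitlines():
        if line.startswith("decoder:"):
            inside = True
        elif inside and line and not line[0].isspace():
            break  # the next top-level key ends the block
        if inside:
            lines.append(line)
    return lines
```

Calling `print("\n".join(decoder_section(read_model_config("speakerverification_speakernet.nemo"))))` would then show the values passed to the decoder, which is where a string such as a comma-separated list would be visible.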

And the speaker recognition example says it will work with a .nemo file here: https://github.com/NVIDIA/NeMo/blob/2430d0f68c1fb2eb4ff92887781af217baccad74/examples/speaker_tasks/recognition/conf/speaker_identification_infer.yaml#L13

I can run the example successfully with my set of embedding and test files in "cosine_similarity" mode.

If I switch the model path to a speakerverification_speakernet.nemo downloaded from the link above, I get this error...

  File "/mnt/c/Users/foo/Git/NeMo/bar/speaker_identification_infer.py", line 152, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/core/config/hydra_runner.py", line 105, in wrapper
    _run_hydra(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 389, in _run_hydra
    _run_app(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 452, in _run_app
    run_and_report(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 216, in run_and_report
    raise ex
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 453, in <lambda>
    lambda: hydra.run(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/usr/local/lib/python3.10/dist-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/usr/local/lib/python3.10/dist-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/mnt/c/Users/foo/Git/NeMo/bar/speaker_identification_infer.py", line 64, in main
    speaker_model = EncDecSpeakerLabelModel.restore_from(model_path)
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/core/classes/modelPT.py", line 316, in restore_from
    instance = cls._save_restore_connector.restore_from(
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/core/connectors/save_restore_connector.py", line 235, in restore_from
    loaded_params = self.load_config_and_state_dict(
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/core/connectors/save_restore_connector.py", line 158, in load_config_and_state_dict
    instance = calling_cls.from_config_dict(config=conf, trainer=trainer)
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/core/classes/common.py", line 506, in from_config_dict
    raise e
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/core/classes/common.py", line 498, in from_config_dict
    instance = cls(cfg=config, trainer=trainer)
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/collections/asr/models/label_models.py", line 166, in __init__
    self.decoder = EncDecSpeakerLabelModel.from_config_dict(cfg.decoder)
  File "/usr/local/lib/python3.10/dist-packages/nemo_toolkit-1.13.0rc0-py3.10.egg/nemo/core/classes/common.py", line 467, in from_config_dict
    instance = hydra.utils.instantiate(config=config)
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/instantiate/_instantiate2.py", line 222, in instantiate
    return instantiate_node(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/instantiate/_instantiate2.py", line 339, in instantiate_node
    return _call_target(target, partial, args, kwargs, full_key)
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/instantiate/_instantiate2.py", line 97, in _call_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error in call to target 'nemo.collections.asr.modules.conv_asr.SpeakerDecoder':
ValueError("invalid literal for int() with base 10: ','")