NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0
10.84k stars, 2.26k forks

Cannot load pretrained BERT for punctuation and capitalization #7070

Closed mohammadjavadpirhadi closed 11 months ago

mohammadjavadpirhadi commented 11 months ago

Describe the bug

We are trying to submit a paper to the Social-IQ-2.0 challenge (ICCV2023), and the submission deadline is July 21. We would be thankful if you could fix this quickly and release a stable version; we have had trouble using the package since yesterday.

Cannot load the pretrained BERT model for punctuation and capitalization because of the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-4-5f218ff2c1dd> in <cell line: 3>()
      1 print(f"Available_models: {nemo_nlp.models.PunctuationCapitalizationModel.get_available_model_names()}\n")
      2 
----> 3 pretrained_model = nemo_nlp.models.PunctuationCapitalizationModel.from_pretrained("punctuation_en_bert")
      4 # define the list of queiries for inference
      5 queries = [

4 frames
/usr/local/lib/python3.10/dist-packages/nemo/core/classes/common.py in from_pretrained(cls, model_name, refresh_cache, override_config_path, map_location, strict, return_config, trainer, save_restore_connector)
    850             )
    851 
--> 852         instance = class_.restore_from(
    853             restore_path=nemo_model_file_in_cache,
    854             override_config_path=override_config_path,

/usr/local/lib/python3.10/dist-packages/nemo/core/classes/modelPT.py in restore_from(cls, restore_path, override_config_path, map_location, strict, return_config, save_restore_connector, trainer)
    433 
    434         cls.update_save_restore_connector(save_restore_connector)
--> 435         instance = cls._save_restore_connector.restore_from(
    436             cls, restore_path, override_config_path, map_location, strict, return_config, trainer
    437         )

/usr/local/lib/python3.10/dist-packages/nemo/core/connectors/save_restore_connector.py in restore_from(self, calling_cls, restore_path, override_config_path, map_location, strict, return_config, trainer)
    246         conf, instance, state_dict = loaded_params
    247         state_dict = self.modify_state_dict(conf, state_dict)
--> 248         self.load_instance_with_state_dict(instance, state_dict, strict)
    249         logging.info(f'Model {instance.__class__.__name__} was successfully restored from {restore_path}.')
    250         return instance

/usr/local/lib/python3.10/dist-packages/nemo/core/connectors/save_restore_connector.py in load_instance_with_state_dict(self, instance, state_dict, strict)
    201             strict: Bool, whether to perform strict checks when loading the state dict.
    202         """
--> 203         instance.load_state_dict(state_dict, strict=strict)
    204         instance._set_model_restore_state(is_being_restored=False)
    205 

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   2039 
   2040         if len(error_msgs) > 0:
-> 2041             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2042                                self.__class__.__name__, "\n\t".join(error_msgs)))
   2043         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for PunctuationCapitalizationModel:
    Unexpected key(s) in state_dict: "bert_model.embeddings.position_ids".
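
For readers unfamiliar with this message: when strict is true, load_state_dict compares the key names in the checkpoint against the key names the model itself defines, and raises if they do not match. The sketch below mimics only that comparison in plain Python, with no PyTorch or NeMo involved. The helper name diff_state_dict_keys and the word_embeddings key are illustrative assumptions; only bert_model.embeddings.position_ids is taken from the error above.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    expected: Iterable[str], provided: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) key names, each sorted.

    missing:    keys the model defines but the checkpoint lacks
    unexpected: keys the checkpoint holds but the model does not define
    """
    expected_set, provided_set = set(expected), set(provided)
    missing = sorted(expected_set - provided_set)
    unexpected = sorted(provided_set - expected_set)
    return missing, unexpected


# Illustrative key names (assumption); only
# "bert_model.embeddings.position_ids" comes from the error message.
model_keys = ["bert_model.embeddings.word_embeddings.weight"]
checkpoint_keys = [
    "bert_model.embeddings.word_embeddings.weight",
    "bert_model.embeddings.position_ids",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

A non-empty unexpected list corresponds to the "Unexpected key(s) in state_dict" part of the RuntimeError: the checkpoint carries an entry that the installed model class no longer registers.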

Steps/Code to reproduce bug

The following line in the "Inference using a pretrained model" section of https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Punctuation_and_Capitalization.ipynb causes this error:

pretrained_model = nemo_nlp.models.PunctuationCapitalizationModel.from_pretrained("punctuation_en_bert")

Expected behavior

Loading the pretrained checkpoint successfully.

titu1994 commented 11 months ago

You can use the branch from the pull request below; it fixes this issue. Alternatively, you can wait until it is merged and then pip install the main branch.

https://github.com/NVIDIA/NeMo/pull/7068

titu1994 commented 11 months ago

However, we will not be releasing a new version until the end of the month, so installing from the main branch will be required. Instructions are on the README page.
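
If installing from the main branch is not an option, one possible stopgap is to drop the stale key before the state dict is loaded. The traceback above shows that the connector calls modify_state_dict(conf, state_dict) right before load_instance_with_state_dict, and that from_pretrained accepts both strict and save_restore_connector, so those look like plausible places to hook in; that wiring is not shown here and has not been tested against NeMo. The sketch below covers only the filtering step on a plain mapping. drop_keys is a hypothetical helper, and the placeholder values stand in for tensors.

```python
from collections import OrderedDict
from typing import Iterable, Mapping, TypeVar

V = TypeVar("V")


def drop_keys(
    state_dict: Mapping[str, V], stale_keys: Iterable[str]
) -> "OrderedDict[str, V]":
    """Return a copy of state_dict without the given keys.

    Order of the remaining entries is preserved and the input
    mapping is left untouched.
    """
    stale = set(stale_keys)
    return OrderedDict(
        (key, value) for key, value in state_dict.items() if key not in stale
    )


# Placeholder values (assumption): a real checkpoint would hold tensors.
checkpoint = OrderedDict(
    [
        ("bert_model.embeddings.word_embeddings.weight", "tensor-placeholder"),
        ("bert_model.embeddings.position_ids", "tensor-placeholder"),
    ]
)

cleaned = drop_keys(checkpoint, ["bert_model.embeddings.position_ids"])
print(list(cleaned))
```

Filtering one named key keeps the rest of the strict check intact, whereas disabling strict loading altogether would also hide any other mismatch between checkpoint and model.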

mohammadjavadpirhadi commented 11 months ago

Thanks. You saved our day. This branch works fine.