NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

AttributeError: 'MegatronGPTModel' object has no attribute 'decoder' #10034

Open lianghsun opened 1 month ago

lianghsun commented 1 month ago

Description

I am retraining a LLaMA3 model. Due to the limited size of my dataset, I attempted to use freeze_updates as referenced in the NVIDIA NeMo documentation. My configuration is as follows:

freeze_updates:
  enabled: true  # set to false if you want to disable freezing
  modules:   # list all of the modules you want to have freezing logic for
    decoder: 100

However, I encountered the following error:

AttributeError: 'MegatronGPTModel' object has no attribute 'decoder'

I also tried changing decoder to encoder or joint, but I still got errors. How should this setting be configured correctly?

Additionally, within the NeMo framework, is it possible to freeze specific layers, such as only the attention layer? If so, how can I achieve this? Thanks!

ericharper commented 3 weeks ago

I'm curious whether by retraining you mean continued training: https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/allmodels/continuetraining.html?highlight=continued%2520training#configure-continual-learning

Regarding your error, I believe the decoder module will be MegatronGPTModel.model.decoder. But you can check by inspecting the MegatronGPTModel instance.
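
For example, here is a minimal sketch of how you could confirm the module path and, separately, freeze only the attention weights by hand. It assumes model is an already-instantiated MegatronGPTModel from your training script, and the "self_attention" name filter is an assumption based on Megatron-style parameter names, so adjust it after inspecting the printed names:

# Minimal sketch: `model` is assumed to be an already-built MegatronGPTModel
# instance (the object your training script constructs from the YAML config).

# 1) Inspect the top-level submodules to find the key expected by freeze_updates.
for name, module in model.named_children():
    print(name, type(module).__name__)

# Drill one level down; the transformer stack is expected (not guaranteed)
# to sit at model.model.decoder, which would make the config key "model.decoder".
for name, module in model.model.named_children():
    print("model." + name, type(module).__name__)

# 2) Freeze only attention parameters manually, as an alternative to freeze_updates.
# The "self_attention" substring is an assumption about Megatron parameter names;
# check the names printed by named_parameters() on your own checkpoint first.
frozen = 0
for name, param in model.named_parameters():
    if "self_attention" in name:
        param.requires_grad = False
        frozen += 1
print(f"Froze {frozen} attention parameter tensors")

Note that setting requires_grad = False keeps those weights frozen for the entire run, whereas freeze_updates is meant to unfreeze a module after the configured number of updates.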