lianghsun opened this issue 1 month ago
I'm curious whether by "retraining" you mean continued training? https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/allmodels/continuetraining.html?highlight=continued%2520training#configure-continual-learning
Regarding your error, I believe the decoder module will be `MegatronGPTModel.model.decoder`, but you can confirm by inspecting the `MegatronGPTModel` instance.
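For example, a minimal sketch of that check (the checkpoint path is a placeholder, and I'm assuming the NeMo 1.x import path; Megatron models usually also want a PTL trainer attached, omitted here for brevity):

```python
from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel

# "my_llama3.nemo" is a placeholder path (assumption).
model = MegatronGPTModel.restore_from("my_llama3.nemo")

# Print submodule paths that mention "decoder" so you can see the exact
# dotted name to reference, instead of the bare names from the docs example.
for name, _ in model.named_modules():
    if "decoder" in name:
        print(name)  # expect something like "model.decoder"
```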
Description
I am retraining a LLaMA 3 model. Because my dataset is limited in size, I attempted to use `freeze_updates` as referenced in the NVIDIA NeMo documentation. My configuration is as follows:

However, I encountered the following error:
I also tried changing `decoder` to `encoder` or `joint`, but I still ran into errors. How should this setting be configured correctly?

Additionally, within the NeMo framework, is it possible to freeze specific layers, such as only the attention layers? If so, how can I achieve this? Thanks!
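(For context, I know I could freeze parameters by hand in plain PyTorch with something like the sketch below, where the `self_attention` substring is only my guess at Megatron-style parameter naming, but I'd prefer a supported config-level option if one exists.)

```python
# Minimal sketch (plain PyTorch, not a NeMo config option): freeze only
# attention parameters by matching their names. The "self_attention"
# substring is an assumption; verify the real names with
# model.named_parameters() first.
for name, param in model.named_parameters():
    if "self_attention" in name:
        param.requires_grad = False

# Sanity check: count how many parameters remain trainable.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")
```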