iniverno closed this issue 6 months ago
Sorry, the current version of the add_exit_layers.sh script has only been verified on checkpoints generated by EE-LLM. If your checkpoint was generated by the official Megatron-LM, the script needs to check whether each EE-specific arg exists before accessing it, and fall back to a default value when it does not.
If you need it urgently, you can modify tools/checkpoint/checkpoint_converter.py as described above. We will also test the script and fix this bug within this week.
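The existence check described above can be sketched roughly as follows. This is only an illustration, not the actual code in checkpoint_converter.py: the helper name get_checkpoint_arg, the EE_DEFAULTS table, and the empty-list default for exit_layer_nums are all assumptions made for the example, and the real defaults should come from EE-LLM's argument definitions.

```python
from argparse import Namespace

# Hypothetical fallback values; the real defaults are defined by EE-LLM's
# argument parser and may differ from what is shown here.
EE_DEFAULTS = {"exit_layer_nums": []}


def get_checkpoint_arg(checkpoint_args, name, defaults=EE_DEFAULTS):
    """Return checkpoint_args.<name> if it exists, else a fallback default.

    Hypothetical helper: a checkpoint saved by the official Megatron-LM
    does not carry EE-specific args, so a plain attribute access fails.
    """
    if hasattr(checkpoint_args, name):
        return getattr(checkpoint_args, name)
    if name in defaults:
        return defaults[name]
    raise AttributeError(f"checkpoint args have no '{name}' and no default is known")


# A Megatron-LM style checkpoint: no EE-specific args are present.
megatron_args = Namespace(num_layers=24)
# An EE-LLM style checkpoint: the arg is present and must be preserved.
ee_args = Namespace(num_layers=24, exit_layer_nums=[6, 12])

megatron_exits = get_checkpoint_arg(megatron_args, "exit_layer_nums")
ee_exits = get_checkpoint_arg(ee_args, "exit_layer_nums")
```

With a helper like this, each direct access such as checkpoint_args.exit_layer_nums in the converter would be replaced by a lookup that tolerates missing attributes, while checkpoints produced by EE-LLM keep their stored values.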
Please check whether the branch fix/pxc/add_exit_layers solves the problem.
Describe the bug: I am trying to insert the exit layers into a checkpoint previously saved with Megatron. The conversion script expects many EE-specific arguments to be present in the checkpoint.
To Reproduce: Run the conversion script with its default parameters. It fails when it tries to access checkpoint_args.exit_layer_nums.