Closed 13416157913 closed 6 months ago
How did you generate your model? By default, hf_to_megatron.py
sets --make_vocab_size_divisible_by to 128.
You could try changing that value in the script to 1.
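
For context, here is a rough sketch of what a "divisible by" setting usually does to the vocabulary size: it rounds the size up to the next multiple. The function name `pad_vocab_size` and the `tp_size` parameter are hypothetical and used only for illustration. Multiplying by the tensor-parallel size is an assumption about how the padding is computed, so check the actual code in your checkout.

```python
def pad_vocab_size(vocab_size: int, divisible_by: int = 128, tp_size: int = 1) -> int:
    """Round vocab_size up to the next multiple of divisible_by * tp_size.

    Hypothetical helper, not the script's real function. The tp_size factor
    is an assumption about how the padding multiple is formed.
    """
    multiple = divisible_by * tp_size
    # Ceiling division, then scale back up to the nearest multiple.
    return ((vocab_size + multiple - 1) // multiple) * multiple


if __name__ == "__main__":
    print(pad_vocab_size(32000))                   # already a multiple of 128
    print(pad_vocab_size(32001))                   # padded up to the next multiple of 128
    print(pad_vocab_size(32001, divisible_by=1))   # no padding when the multiple is 1
```

If the checkpoint was converted with 128 and finetuning runs with 1 (or the other way around), the embedding shapes will differ, which is the kind of mismatch this setting tends to cause.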
Follow-up question: why would you want to set it to exactly 1 anyway?
Hello, when I set --make_vocab_size_divisible_by 1 in the finetune script, the value of --make_vocab_size_divisible_by that is actually used is still 128. Why?