Open shamanez opened 2 months ago
+1
I also strongly need this tool.
https://github.com/NVIDIA/Megatron-LM/issues/756#issuecomment-2126186633
Here is some information about converting Hugging Face checkpoints to NeMo. There appears to be a conversion script available on GitHub. I haven't verified it, but it might be useful: https://medium.com/karakuri/train-moes-on-aws-trainium-a0ebb599fbda and https://github.com/abeja-inc/Megatron-LM
As mentioned here, a proper MoE/Mixtral checkpoint converter script would help us fine-tune Mixtral.