bigcode-project / starcoder2

Home of StarCoder2!
Apache License 2.0

Megatron model weights for StarCoder2-15B #15

Closed by christiancosgrove 6 months ago

christiancosgrove commented 7 months ago

A year ago, the raw Megatron weights for StarCoder were released.

Would it be possible to release the Megatron weights for StarCoder2, especially the 15B variant?

Also, publishing a script to convert StarCoder2 from Megatron format to HuggingFace format would be helpful. Thanks!

loubnabnl commented 6 months ago

The 15B model was trained with Megatron-NeMo. You can find instructions for using the checkpoint here: https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/starcoder2/

christiancosgrove commented 6 months ago

Thank you!