bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

[checkpoints] replace bf16 with fp32 checkpoint weights #327

Open stas00 opened 2 years ago

stas00 commented 2 years ago

As requested by @thomasw21, this is a small hack script that replaces the half-precision weights in an existing HF transformers checkpoint (seeded from the universal checkpoint) with fp32 weights; see the script for the steps.
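The core idea can be sketched as follows. This is a minimal illustration, not the actual script from the PR: it assumes the fp32 weights are already available as a plain PyTorch state dict on disk (e.g. extracted from the universal checkpoint), and all paths and the function name are hypothetical.

```python
# Hedged sketch of the weight-replacement idea (not the PR's script).
# Assumes both checkpoints are plain PyTorch state dicts; the fp32 one
# would come from the universal checkpoint. Paths/names are hypothetical.
import torch

def replace_with_fp32(ckpt_path, fp32_path, out_path):
    # Load the existing half-precision (bf16) HF checkpoint
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # Load the fp32 weights to substitute in
    fp32 = torch.load(fp32_path, map_location="cpu")
    # Swap in the fp32 tensor for every matching parameter name
    for name in ckpt:
        if name in fp32:
            ckpt[name] = fp32[name].to(torch.float32)
    torch.save(ckpt, out_path)
```

The real workflow also has to preserve the rest of the HF checkpoint layout (config, tokenizer files, shard index), which is why the PR ships it as a dedicated script rather than a one-liner.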

stas00 commented 2 years ago

@thomasw21, at your convenience - no rush at all - we can merge this when it meets your needs.

thomasw21 commented 2 years ago

Hey sorry for delaying this, I haven't had the chance to play around with it yet. Let me know if it's blocking and we'll merge then.

stas00 commented 2 years ago

Absolutely no rush, you're the only one who asked for it. So it's totally up to you, Thomas.