EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
https://www.eleuther.ai/
Apache License 2.0
6.95k stars, 1.02k forks

How to Load Model from pytorch_model.bin into Trained Model for Text Generation? #1254

Open lieh1203 opened 4 months ago

lieh1203 commented 4 months ago

Hello, I recently trained a model using GPT-NeoX and consolidated the checkpoint global_step1000 into a pytorch_model.bin file using the zero_to_fp32.py script. However, I can't figure out how to load this file back into the trained model for text generation. I have reviewed the GPT-NeoX documentation and code, but the specific steps are still unclear to me. Any example code or detailed instructions would be greatly appreciated. Thank you very much!
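For anyone hitting the same question, here is a minimal sketch of a first diagnostic step, not a confirmed GPT-NeoX workflow. It assumes pytorch_model.bin is a plain PyTorch state dict (a mapping from parameter names to tensors) that `torch.load` can read. The helpers `summarize_keys`, `diff_keys`, and `load_fp32_state_dict` are hypothetical names introduced here for illustration. Whether the parameter names in the file match the names your model class expects is exactly what the sketch is meant to check.

```python
from collections import Counter


def summarize_keys(state_dict, depth=2):
    """Group parameter names by their first `depth` dotted components."""
    counts = Counter(".".join(name.split(".")[:depth]) for name in state_dict)
    return dict(counts)


def diff_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) names, relative to the model."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    missing = sorted(model - ckpt)      # the model expects these, the file lacks them
    unexpected = sorted(ckpt - model)   # the file has these, the model does not
    return missing, unexpected


def load_fp32_state_dict(path):
    """Load the consolidated file on CPU (assumes it is a plain state dict)."""
    import torch  # imported lazily so the helpers above run without PyTorch
    return torch.load(path, map_location="cpu")


if __name__ == "__main__":
    # Stand-in names for illustration only; real checkpoint names will differ.
    fake_ckpt = {"layers.0.weight": None, "layers.0.bias": None, "head.weight": None}
    fake_model = ["layers.0.weight", "layers.0.bias", "final.weight"]
    print(summarize_keys(fake_ckpt))
    print(diff_keys(fake_ckpt, fake_model))
```

With a real file, you would call `state_dict = load_fp32_state_dict("pytorch_model.bin")`, compare `state_dict.keys()` against `model.state_dict().keys()` for the model you built, and only then try `model.load_state_dict(state_dict, strict=False)`, which reports `missing_keys` and `unexpected_keys` in its return value. If the two sets of names differ substantially, the file likely needs a key remapping or a different model definition before it can be used for generation.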