Dear Song,
I'm happy to hear that you successfully ran our project. In the model directory you will find a `zero_to_fp32.py` script. This script extracts fp32 consolidated weights from ZeRO stage 2 and 3 DeepSpeed checkpoints. Running it automatically generates a single model file according to the ZeRO stage you used during finetuning. For example:
```
python zero_to_fp32.py "../model" "global_step0/0model.pkl"
```
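The file this produces is an ordinary PyTorch state dict, so it can be loaded with plain `torch.load`. A minimal sketch, assuming the output path from the command above and that `model` is the finetuned `nn.Module` (a placeholder here, not a name from the project):

```python
import torch

# Load the consolidated fp32 state dict written by zero_to_fp32.py
# (the output path matches the example command above).
state_dict = torch.load("global_step0/0model.pkl", map_location="cpu")

# `model` stands for whatever nn.Module was finetuned (hypothetical here):
# model.load_state_dict(state_dict)
```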
Furthermore, when you finetune the model on multiple GPUs, you will get multiple checkpoint files whose names start with `zero_pp_rank_`; their number equals the number of GPUs. If you then want to use the model for inference or other tasks, you need `zero_to_fp32.py` to merge these files into a single usable file. The details involve how DeepSpeed saves models; you can refer to https://github.com/microsoft/DeepSpeed for more information.
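If you prefer to do the merge programmatically instead of via the script, DeepSpeed also exposes the same consolidation logic as a function. A minimal sketch, assuming the `../model` checkpoint directory from the example above (the exact API may vary slightly across DeepSpeed versions):

```python
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

# "../model" is the checkpoint directory DeepSpeed wrote during finetuning;
# the checkpoint tag (e.g. "global_step0") is read from its `latest` file by default.
state_dict = get_fp32_state_dict_from_zero_checkpoint("../model")

# The merged fp32 state dict can then be loaded into the finetuned module:
# model.load_state_dict(state_dict)
```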
Best, Yin Fang
Hi,
After I tried preprocessing and fine-tuning, I found that two checkpoint files were generated: one called 'mp_rank_00_model_states.pt' and the other called 'zero_pp_rank_0_mp_rank_00_optim_states.pt'. It seems that either of them can be passed as the checkpoint_path for the subsequent generative task, and they give me similar generative results. I would highly appreciate more details or an explanation.
Many thanks in advance!
Song