bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

How to continue pre-training Bloom? #366

Open ShinoharaHare opened 1 year ago

ShinoharaHare commented 1 year ago

Hi, I'm trying to continue pre-training bloom-560m on my own dataset on a single GPU. I modified this script to fit my case, but I cannot figure out how to load the checkpoint.

Is there a guide for what I'm trying to do?
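A possible starting point (not a verified recipe for this repo): Megatron-style pretrain scripts typically accept a `--load` flag pointing at a Megatron-format checkpoint directory, plus `--finetune` to reset the optimizer and iteration state so training resumes from the weights only. The checkpoint path below is hypothetical, and the Hugging Face bloom-560m checkpoint would first need converting to Megatron/DeepSpeed format.

```shell
# Sketch only: flag names follow Megatron-LM conventions; the checkpoint
# path is an assumed placeholder for a converted Megatron-format checkpoint.
CHECKPOINT_PATH=/path/to/bloom-560m-megatron

deepspeed pretrain_gpt.py \
    --load $CHECKPOINT_PATH \
    --finetune \
    --save /path/to/new-checkpoints \
    ...  # keep the model/data/parallelism arguments from your existing script
```

Without `--finetune`, the loader would also try to restore the optimizer state and learning-rate schedule saved in the checkpoint, which generally only works when resuming the exact same run.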

lwmlyy commented 1 year ago

Hi, did you find any solution on this?

noob-ctrl commented 1 year ago

@ShinoharaHare Hi, have you solved this problem?