Closed chang-github-00 closed 2 years ago
Hi, thanks for your interest in our work!
Our training framework is based on DeepSpeed, so errors may occur when running the pretraining script directly with torch.distributed.launch.
Could you share the detailed error message you get from DeepSpeed?
We will check and release a new version of our code. Thank you!
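For reference, a minimal sketch of the two launch styles being discussed; the script name `train.py`, the GPU count, and the config file `ds_config.json` are placeholders, not the repository's actual entry point or config:

```shell
# DeepSpeed launcher (the path the authors' framework targets):
deepspeed --num_gpus=4 train.py --deepspeed --deepspeed_config ds_config.json

# Plain PyTorch DDP launcher (may fail if the script assumes DeepSpeed
# initialization, e.g. deepspeed.initialize() instead of a bare DDP wrapper):
python -m torch.distributed.launch --nproc_per_node=4 train.py
```

A script written against `deepspeed.initialize()` generally cannot be swapped to `torch.distributed.launch` without adapting the optimizer and model-wrapping code.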
Dear authors: Thank you for your inspiring work! When we try to follow your work and run the following scripts for pre-training,
we get this error:
It seems that this error only appears when performing DDP training. We have also tried DeepSpeed, but an error occurs while loading the optimizer at line 168 in trainer.py.
I wonder how I can solve this. Thank you very much!