OpenMOSS / CoLLiE

Collaborative Training of Large Language Models in an Efficient Way
https://openlmlab-collie.readthedocs.io
Apache License 2.0

No log information #163

Closed by BeastyZ 5 months ago

BeastyZ commented 5 months ago

Hi, when I train the model, no log information is printed. Could you please fix this bug soon?

KaiLv69 commented 5 months ago

Hi, which specific type of log are you looking for? The DeepSpeed logs should be printed, and once the training loop starts, a progress bar should appear to show the training progress.

By the way, if you are launching the task with srun, you might need to use srun -u *** python -u train.py to show the progress bar, depending on the srun configuration. Both -u flags turn off output buffering: one for srun, the other for the Python interpreter.
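If passing -u is not an option, one workaround is to flush progress output explicitly from the training script. The following is only a minimal sketch under that assumption: report_progress is a hypothetical helper, not part of CoLLiE or DeepSpeed, and it does not replace the library's own progress bar. It shows the general idea of writing a status line and flushing right away so the text does not sit in a buffer.

```python
import io
import sys


def report_progress(step, total, stream=None):
    """Write a one-line progress status and flush it immediately.

    Hypothetical helper for illustration; not a CoLLiE API.
    """
    # Default to stderr, looked up at call time so redirection is respected.
    if stream is None:
        stream = sys.stderr
    # "\r" returns the cursor to the start of the line so the next
    # update overwrites the previous one instead of adding a new line.
    stream.write(f"\rstep {step}/{total}")
    # Explicit flush: do not wait for a newline or a full buffer.
    stream.flush()


if __name__ == "__main__":
    # Usage example: capture the output in memory and inspect it.
    buffer = io.StringIO()
    report_progress(3, 10, stream=buffer)
    print(repr(buffer.getvalue()))
```

Setting the environment variable PYTHONUNBUFFERED=1 has the same effect as python -u and can be easier to apply inside a job script. Note that this only affects the Python side; srun's own buffering is still controlled by its -u option.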

BeastyZ commented 5 months ago

I'm using the srun command without the -u option. In this case, I can see the ppl log output, but no training progress bar. Maybe you are right; I will try srun -u *** python -u train.py. Thanks a lot.