Thanks for the nice work. I would appreciate your answers to the following two questions.
How long does the whole training (all three stages) take, and what is the training time for each stage?
Since you also add RefinedWeb to the model training, have you evaluated the model on NLP tasks and compared it with the Phi-1.5 model? I expect some performance drop, but I wonder how large the decrease is.
Looking forward to your reply, thank you.