Open DeepBehavier opened 3 months ago
I used 6 A100 GPUs to train the stage2_e2e model. After the fourth epoch completed, an error occurred, as shown below. The error is easy to reproduce. Please help me solve this problem.
Have you solved the problem?
Same error here. Have you solved the problem?