Open jarvishou829 opened 1 year ago
It seems to be an out-of-memory issue. Memory usage increases abnormally when the evaluation process is almost finished.
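One way to check this is to log the process's peak memory while the evaluation runs and see whether it jumps near the end. Below is a minimal sketch, not this repository's code: `evaluate_with_memory_log`, `batches`, and `run_batch` are hypothetical names standing in for the real evaluation loop. It uses only Python's standard `resource` module, so it works on Linux and macOS but not on Windows.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def evaluate_with_memory_log(batches, run_batch, log_every=100):
    """Run `run_batch` on every batch and print peak memory periodically.

    `batches` and `run_batch` are placeholders (assumptions) for the
    real data loader and per-batch inference call.
    """
    results = []
    for step, batch in enumerate(batches, start=1):
        results.append(run_batch(batch))
        if step % log_every == 0:
            print(f"step {step}: peak RSS {peak_rss_mib():.1f} MiB")
    print(f"done ({len(results)} batches): peak RSS {peak_rss_mib():.1f} MiB")
    return results
```

If the logged peak stays roughly flat for most of the run and then climbs sharply in the last steps, that would be consistent with the results being gathered in memory at the end, though this only measures host memory for the current process and not GPU memory.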
You can try running the evaluation on only one GPU.
Hi, I get the same error. I've read that you can fix it by changing the batch size, but unfortunately I can't figure out how to do that. Could you try it and, if it works, tell me how you did it? It would be greatly appreciated.
The evaluation ran 6084 samples rather than 6019 and then closed unexpectedly.