Open xuezu29 opened 2 years ago
Please check whether any other process is running on your GPUs.
I'm sure there is no other process running on the GPUs.
@ruinmessi The problem goes away when I restart training. When 'iter_time' got longer, I ran 'watch nvidia-smi' to check the GPU status, and 'Volatile GPU-Util' stayed low (0-10%) for a long time.
Do you have many gt objects in one image? Your problem may be difficult to locate. I suggest using line_profiler to measure the time spent on each line of the training loop.
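To make the profiling suggestion concrete, here is a rough sketch of per-stage timing for a training loop. It uses only the standard library (`time.perf_counter`) as a coarse stand-in for line_profiler, so it times stages rather than individual lines. `StageTimer`, `load_batch`, and `train_step` are hypothetical names, not part of this repo; swap in the real data loading and forward/backward calls. If the training code is PyTorch, CUDA kernels run asynchronously, so calling `torch.cuda.synchronize()` before each clock read is probably needed for the numbers to be meaningful.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage of a loop."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the stage raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        # Returns (stage name, mean seconds per call), slowest total first.
        rows = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, total / self.counts[name]) for name, total in rows]


# Hypothetical stand-ins for the real data loading and training step.
def load_batch():
    time.sleep(0.02)  # pretend the data loader is the slow part
    return [0.0] * 8


def train_step(batch):
    time.sleep(0.005)  # pretend forward/backward pass
    return sum(batch)


timer = StageTimer()
for _ in range(5):
    with timer.stage("data"):
        batch = load_batch()
    with timer.stage("step"):
        loss = train_step(batch)

for name, mean_s in timer.report():
    print(f"{name}: {mean_s * 1000:.1f} ms per iteration")
```

If the "data" stage dominates once iter_time climbs while GPU utilization stays low, that would point at the input pipeline rather than the model, though this is only one possible explanation.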
Previous epoch: the iter_time (1s) is normal.
After a while: the iter_time (6s-10s) is too long.
This happens occasionally, and iter_time goes back to normal after I restart training. Any suggestions? Thanks a lot!