In the training phase when it calls val.py to evaluate the model performance it will suspend for about half an hour and then display killed. Use the command:
dmesg | egrep -i -B100 "killed process"
It would report that the python process was killed because out of memory error occured.
In the training phase when it calls val.py to evaluate the model performance it will suspend for about half an hour and then display killed. Use the command:
It would report that the python process was killed because out of memory error occured.
How to solve this problem?