NVIDIA / ngc-container-replicator

NGC Container Replicator
BSD 3-Clause "New" or "Revised" License
28 stars 12 forks

training fail #38

Open alicera opened 2 years ago

alicera commented 2 years ago

image https://docs.nvidia.com/deeplearning/tensorrt/container-release-notes/index.html?from=groupmessage

I train my models in an NGC Docker container. Sometimes, when I train a network (such as YOLOv7), the Linux machine disconnects and reboots. How can I debug this to find the root cause?