Closed by mmoollllee 1 year ago
After running everything on my local system, I now get hundreds of lines of the following error message:
[...]
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [7024,0,0], thread: [60,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [7024,0,0], thread: [61,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [7024,0,0], thread: [62,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [7024,0,0], thread: [63,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Here is the jupyter notebook I'm using via JarvisLabs.ai: https://github.com/mmoollllee/yolo-nas-object-blurring/blob/main/notebooks/train-yolo-nas.ipynb
Got it. The error occurs when labels extend beyond the edge of the image, which happens quite often in label-studio. I imported the dataset into roboflow (which already flagged the problem on upload), re-exported it, and the problem is gone :)
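For anyone who wants to check their labels before training, here is a minimal sketch of such a check. It assumes YOLO-style text labels (one `class cx cy w h` line per box, with coordinates normalized to the range 0 to 1), which may not match your export format. The function name `find_out_of_bounds_boxes` and the `tol` parameter are made up for this example and are not part of label-studio, roboflow, or any training library.

```python
from typing import List, Tuple


def find_out_of_bounds_boxes(label_text: str, tol: float = 1e-6) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for every box that leaves the image.

    Assumption: each non-empty line is "class cx cy w h" with normalized values.
    """
    problems = []
    for line_no, line in enumerate(label_text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            problems.append((line_no, "expected 5 fields"))
            continue
        try:
            cx, cy, w, h = (float(p) for p in parts[1:])
        except ValueError:
            problems.append((line_no, "non-numeric coordinate"))
            continue
        # Convert center/size to corner coordinates.
        x_min, x_max = cx - w / 2, cx + w / 2
        y_min, y_max = cy - h / 2, cy + h / 2
        # A small tolerance avoids flagging boxes that touch the edge exactly.
        if x_min < -tol or y_min < -tol or x_max > 1 + tol or y_max > 1 + tol:
            problems.append((line_no, "box extends beyond the image edge"))
    return problems


if __name__ == "__main__":
    sample = "0 0.5 0.5 0.2 0.2\n1 0.95 0.5 0.2 0.2\n"
    for line_no, reason in find_out_of_bounds_boxes(sample):
        print(f"line {line_no}: {reason}")
```

Running this over the contents of every label file and printing the file name next to each reported line should point to the images whose boxes need to be fixed or clipped.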
💡 Your Question
I am trying to train in Google Colab on a custom dataset. The runtime always crashes on the 43rd item in epoch 0. Is there a way to tell which image is being processed at that moment?
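One generic way to narrow this down, independent of the training library, is to walk through the dataset one sample at a time outside the training loop and record which indices fail. The sketch below assumes the dataset supports `len()` and integer indexing (as a Python list or a map-style dataset does). The names `find_failing_samples` and `check` are hypothetical helpers for illustration, not an API of the trainer.

```python
from typing import Any, Callable, List, Sequence, Tuple


def find_failing_samples(
    dataset: Sequence[Any],
    check: Callable[[Any], None],
) -> List[Tuple[int, str]]:
    """Load every sample individually and collect the indices that raise.

    `check` is a user-supplied function that raises an exception when a
    sample looks invalid (for example, a box outside the image).
    """
    failures = []
    for index in range(len(dataset)):
        try:
            sample = dataset[index]  # loading itself may fail
            check(sample)            # custom validation may fail
        except Exception as exc:
            failures.append((index, f"{type(exc).__name__}: {exc}"))
    return failures


if __name__ == "__main__":
    # Toy stand-in for a real dataset: each sample is a list of box coordinates.
    toy_dataset = [[0.1, 0.2], [0.5, 1.3], [0.9, 0.4]]

    def check_in_unit_range(sample: Any) -> None:
        if any(value < 0 or value > 1 for value in sample):
            raise ValueError("coordinate outside [0, 1]")

    for index, message in find_failing_samples(toy_dataset, check_in_unit_range):
        print(f"sample {index} failed: {message}")
```

Because each sample is loaded on its own on the CPU, a bad sample shows up as an ordinary Python exception with its index, which can then be mapped back to an image path through whatever file list the dataset keeps.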
Versions
No response