yjh0410 / FCOS-RT_PyTorch

A real-time version of FCOS, inspired by FCOSv2.

Training error #1

Open leehao178 opened 4 years ago

leehao178 commented 4 years ago

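Hi @yjh0410, hello. My environment: Ubuntu 16.04, CUDA 10.0, Python 3.6, PyTorch 1.1.0, torchvision 0.3.0.

I am training on my own dataset using voc0712.py. I had to change int to float on line 66 to get past ValueError: invalid literal for int() with base 10: '190.0'. After that change, training runs for a while and then fails with the following error: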

/opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype *, const Dtype *, Dtype *) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [384,0,0] Assertion `*input >= 0. && *input <= 1.` failed.
(the same assertion is repeated for threads [385,0,0] through [415,0,0] of block [10,0,0])
Epoch[3 / 250] || iter 210 || Loss: 23.8245 || || lr: 0.00005026 || || input size: 320 ||
Traceback (most recent call last):
  File "train_voc.py", line 206, in <module>
    train(fcos_lite, device)
  File "train_voc.py", line 151, in train
    cls_loss, ctn_loss, box_loss = tools.loss(out, targets, num_classes=args.num_classes)
  File "/home/danny/Lab/FCOS/FCOS-LITE/tools.py", line 120, in loss
    box_loss = torch.mean(torch.sum(box_loss_func(iou, gt_iou) * gt_pos, dim=-1) / N_pos)
RuntimeError: CUDA error: device-side assert triggered
(the same assertion is then repeated for threads [192,0,0] through [223,0,0] of block [10,0,0])
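
On the ValueError mentioned in this comment: int() rejects a string such as '190.0', so annotation files that store coordinates as decimals need to be parsed as floats first. The following is a minimal sketch of that idea, not the code in voc0712.py; the helper name parse_box and the sample annotation are hypothetical, and it assumes the usual VOC layout of xmin, ymin, xmax and ymax under a bndbox element.

```python
import xml.etree.ElementTree as ET


def parse_box(bndbox):
    """Return [xmin, ymin, xmax, ymax] as ints, accepting '190' or '190.0'."""
    coords = []
    for tag in ("xmin", "ymin", "xmax", "ymax"):
        text = bndbox.find(tag).text.strip()
        # int('190.0') raises ValueError; going through float() first avoids it.
        coords.append(int(round(float(text))))
    return coords


# Hypothetical annotation fragment mixing integer and decimal strings.
sample = ET.fromstring(
    "<bndbox><xmin>190.0</xmin><ymin>12</ymin>"
    "<xmax>250.6</xmax><ymax>99.0</ymax></bndbox>"
)
print(parse_box(sample))
```

Rounding to int keeps the rest of a pipeline that expects integer pixel coordinates unchanged; keeping the values as floats, as described in the comment, is the other option.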

yjh0410 commented 4 years ago

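Hello! I have not run into the problem you describe with voc2007.py. Also, judging from the error, the BCE function is failing during the loss computation, but I cannot pin down the exact cause right now. Because of the epidemic I am staying at home and cannot train FCOS-LITE, so I am not yet aware of many of the bugs that may show up during training. This project is therefore still a work in progress, and I am sorry it has caused you unnecessary trouble. It is frustrating for me as well, but given the current situation the university cannot yet announce when it will reopen, and I cannot return to campus. I hope you understand. Once I am back at school, I will debug this project.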
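
The assertion in the log, `*input >= 0. && *input <= 1.`, fails whenever a value passed to the BCE loss is outside [0, 1] or is NaN, since NaN fails every comparison. The thread does not establish which of these happened here. As a rough sketch in plain Python (the names invalid_bce_inputs and checked_bce are hypothetical and are not part of tools.py), a check like the following, run on the values fed to the loss, would report the offending entries on the host instead of triggering a device-side assert:

```python
import math


def invalid_bce_inputs(probs):
    """Return the indices of values that are NaN or outside [0, 1]."""
    return [
        i for i, p in enumerate(probs)
        if math.isnan(p) or not 0.0 <= p <= 1.0
    ]


def checked_bce(probs, targets, eps=1e-12):
    """Mean binary cross-entropy that refuses invalid probabilities."""
    bad = invalid_bce_inputs(probs)
    if bad:
        raise ValueError("BCE inputs outside [0, 1] or NaN at indices %s" % bad)
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


print(invalid_bce_inputs([0.2, float("nan"), 1.5, -0.1, 1.0]))
```

In the training script, the equivalent check would be applied to the predicted IoU tensor just before the call on line 120 of tools.py shown in the traceback; which values are actually invalid there is an open question in this thread.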

/opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [206,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [207,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [208,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [209,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [210,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [211,0,0] Assertion input >= 0. && input <= 1. failed. 
/opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [212,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [213,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [214,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [215,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [216,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [217,0,0] Assertion input >= 0. && input <= 1. failed. 
/opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [218,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [219,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [220,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [221,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [222,0,0] Assertion input >= 0. && input <= 1. failed. /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype , const Dtype , Dtype ) [with Dtype = float, Acctype = float]: block: [10,0,0], thread: [223,0,0] Assertion input >= 0. && *input <= 1. failed.

leehao178 commented 4 years ago

Got it, thank you! I'm looking forward to this promising project, and I hope the epidemic eases soon!

yjh0410 commented 4 years ago

Thank you very much! As soon as I have good results, I will push them to GitHub right away.

leehao178 commented 4 years ago

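Hello, I tested the new version and still get the error. It happens with VOC2007, VOC2012, and my own training set.

I'm not very experienced with this code, but here is what I observed: at https://github.com/yjh0410/FCOS-LITE/blob/6a474d9ac3385e5c8de47734f853349e79e34b0d/train_voc.py#L147 every value in `out` is NaN, so the loss computed by tools.loss is NaN as well. I'm not sure whether that is what triggers the error. Thanks again :)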

yjh0410 commented 4 years ago

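Hello! With the latest code, I do not see out = NaN on my machine. For a NaN problem, I would check the following:
1. Print `out` at every iteration to see at which step NaN first appears.
2. Also print the input image data to confirm whether the input contains NaN.
These are the two methods I usually use. If the problem comes from 1, I suspect the network diverged during training; I run into this often. If it comes from 2, hmm... I have used this data preprocessing code for a long time, including in my own reproductions of yolo-v2 and v3, and have never seen NaN in the input.

Finally, thank you very much for raising this issue! I want to do my best to maintain this project, but many latent problems only show up after a full training run, so there is little I can do for now, although I am anxious about it too. The project is still in early development, so it will have many problems. I apologize for the unnecessary trouble.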
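The two checks described above could be sketched roughly as follows. This is only an illustration, not code from the repository: the helper names `has_non_finite` and `check_step` are hypothetical, and the tensors stand in for the real image batch and network output in train_voc.py.

```python
import torch


def has_non_finite(t: torch.Tensor) -> bool:
    """Return True if the tensor contains any NaN or +/-Inf value."""
    return not bool(torch.isfinite(t).all())


def check_step(step: int, images: torch.Tensor, out: torch.Tensor) -> str:
    """Classify one training step as 'bad_input', 'bad_output', or 'ok'."""
    if has_non_finite(images):
        # Check 2 from the list above: the input batch itself is corrupted.
        print(f"step {step}: input images contain NaN/Inf")
        return "bad_input"
    if has_non_finite(out):
        # Check 1: the input is clean, so the NaN appeared inside the network.
        print(f"step {step}: network output contains NaN/Inf")
        return "bad_output"
    return "ok"


if __name__ == "__main__":
    clean = torch.rand(2, 3, 8, 8)           # stand-in for an image batch
    broken = clean.clone()
    broken[0, 0, 0, 0] = float("nan")        # inject a single NaN
    print(check_step(0, clean, clean.mean(dim=1)))
    print(check_step(1, clean, broken.mean(dim=1)))
    print(check_step(2, broken, broken.mean(dim=1)))
```

Calling such a check right after the forward pass and stopping at the first non-"ok" result shows which iteration first produces NaN, which is usually more informative than the later device-side assert.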

leehao178 commented 4 years ago

OK, got it. I'll give it a try. Thanks!

yjh0410 commented 4 years ago

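Hello! I just found a bug in tools.py: my BCE_focal_loss function was written incorrectly. I have fixed it. When convenient, please try again and see whether the NaN problem still occurs.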
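The thread does not say what was wrong in BCE_focal_loss or how it was changed. For reference only, here is a minimal sketch of a standard binary focal loss that stays finite for extreme predictions. The function name `binary_focal_loss` and the parameters `alpha`, `gamma`, and `eps` are assumptions for illustration and are not taken from tools.py.

```python
import torch


def binary_focal_loss(logits: torch.Tensor,
                      targets: torch.Tensor,
                      alpha: float = 0.25,
                      gamma: float = 2.0,
                      eps: float = 1e-6) -> torch.Tensor:
    """Element-wise binary focal loss computed from raw logits.

    logits:  unnormalized scores of any shape.
    targets: tensor of the same shape with values 0.0 or 1.0.
    """
    p = torch.sigmoid(logits)
    # Keep probabilities strictly inside (0, 1) so log() never returns -inf.
    p = torch.clamp(p, eps, 1.0 - eps)
    # Positive term: down-weights easy positives by (1 - p)^gamma.
    pos = -alpha * (1.0 - p) ** gamma * torch.log(p) * targets
    # Negative term: down-weights easy negatives by p^gamma.
    neg = -(1.0 - alpha) * p ** gamma * torch.log(1.0 - p) * (1.0 - targets)
    return pos + neg


logits = torch.tensor([100.0, -100.0, 0.0])
targets = torch.tensor([0.0, 1.0, 1.0])
print(binary_focal_loss(logits, targets))  # all values are finite
```

The clamp is the part relevant to this issue: a probability of exactly 0 or 1 makes the logarithm infinite, and a NaN reaching a BCE kernel fails the `*input >= 0. && *input <= 1.` check seen in the log.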

yjh0410 commented 4 years ago

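Hello, I seem to have fixed the bug in my model. In tools.py, the loss computation uses a variable N_pos that is later used as the divisor in a division, but it can contain 0. That can lead to problems such as NaN. I have fixed this bug, and it should work now.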
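The actual fix in tools.py is not shown in this thread. One common way to guard a divisor like N_pos is to clamp it to at least 1, sketched below. The function name `normalize_by_positives` and the tensor shapes are assumptions for illustration.

```python
import torch


def normalize_by_positives(per_location_loss: torch.Tensor,
                           gt_pos: torch.Tensor) -> torch.Tensor:
    """Average the per-image loss over positive locations without dividing by zero.

    per_location_loss: [B, HW] loss value at every location.
    gt_pos:            [B, HW] mask, 1.0 at positive locations and 0.0 elsewhere.
    """
    # Number of positives per image; an image without objects gives 0 here.
    n_pos = gt_pos.sum(dim=-1)
    # Clamp the divisor to at least 1 so an empty image contributes 0 instead of 0/0.
    n_pos = torch.clamp(n_pos, min=1.0)
    per_image = (per_location_loss * gt_pos).sum(dim=-1) / n_pos
    return per_image.mean()


loss_map = torch.tensor([[0.5, 1.5, 2.0], [1.0, 1.0, 1.0]])
mask = torch.tensor([[1.0, 1.0, 0.0], [0.0, 0.0, 0.0]])  # second image has no positives

# Unguarded version: the second image computes 0 / 0, so the mean becomes NaN.
naive = ((loss_map * mask).sum(dim=-1) / mask.sum(dim=-1)).mean()
print(naive)
# Guarded version: (1.0 + 0.0) / 2 = 0.5
print(normalize_by_positives(loss_map, mask))
```

A single image without any positive location is enough to turn the whole batch loss into NaN in the unguarded form, which matches the symptom reported above.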

leehao178 commented 4 years ago

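Hello,

After your fix, training now runs normally. However, I noticed that whenever I save .pth checkpoints, saving works at first, but after a while the computer freezes: the whole screen stops responding and I can only reboot. I'm not sure whether saving happens to push GPU memory over the limit or something similar, because GPU memory is always nearly full when the freeze happens. By the way, I'm using a 2080 Ti. After I reduced the batch size from 8 to 4, training ran to completion. Finally, thank you very much for the diligent updates. I really appreciate it and look forward to your future work!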

yjh0410 commented 4 years ago

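Hello! I also use a 2080 Ti, with the batch size set to 32 during training, and training runs to completion without any freezes. My current test results are quite poor: the mAP on VOC is below 40, and the FPS is not very high either, slower than my yolo-v2. Next I will probably do some hyperparameter tuning or heavier modifications. I have pushed the test code to GitHub.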