Open wzp2019201645 opened 2 years ago
The txt file may not have been edited correctly.
Error:
Message=Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in
Hello, I ran into the following situation:
initialize network with normal type
Load weights model_data/voc_weights_resnet.pth.
Thread 0x3 exited with return value 0 (0x0).
Thread 0x2 exited with return value 0 (0x0).
Start Train
Epoch 1/50: 0%| | 0/4137 [00:00<?, ?it/s<class 'dict'>]
Epoch 1/50: 0%| | 0/4137 [00:06<?, ?it/s<class 'dict'>]
Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in
  File "D:\复现算法\faster-rcnn-pytorch-master\utils\utils_fit.py", line 38, in fit_one_epoch
    pbar.update(1)
  File "D:\复现算法\faster-rcnn-pytorch-master\train.py", line 211, in
    fit_one_epoch(model, train_util, loss_history, optimizer, epoch, epoch_step, epoch_step_val, gen, gen_val, end_epoch, Cuda)
Loaded "utils.utils_fit"
Loaded "main"
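For the FileNotFoundError above, one way to test the suggestion that the txt file was not edited correctly is to check that every image path it lists actually exists. The sketch below is only an illustration: it assumes each line of the annotation txt starts with the image path followed by space-separated boxes, and the function name and the file name `2007_train.txt` are placeholders, not taken from this thread.

```python
import os
from typing import List, Tuple


def find_missing_images(annotation_path: str) -> List[Tuple[int, str]]:
    """Return (line_number, image_path) for every listed image that does not exist.

    Assumption: each non-empty line begins with the image path, followed by
    space-separated boxes. Paths that contain spaces are not handled here.
    """
    missing = []
    with open(annotation_path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            image_path = parts[0]
            if not os.path.isfile(image_path):
                missing.append((line_number, image_path))
    return missing


if __name__ == "__main__":
    # Placeholder name: point this at the txt file that your train.py reads.
    annotation_file = "2007_train.txt"
    if os.path.isfile(annotation_file):
        for line_number, image_path in find_missing_images(annotation_file):
            print(f"line {line_number}: missing {image_path}")
    else:
        print(f"annotation file not found: {annotation_file}")
```

Relative image paths are resolved against the current working directory, so run the check from the same directory you launch train.py from.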
Chinese
Hello, before training on my own dataset I changed the relevant places as you instructed. But after I run train.py, the bug below appears. I can't make sense of it. What is causing it?
initialize network with normal type
Epoch 1/25: 0%| | 0/1665 [00:00<?, ?it/s<class 'dict'>]
Start Train
Epoch 1/25: 0%| | 2/1665 [00:03<45:23, 1.64s/it, lr=0.0001, roi_cls=nan, roi_loc=nan, rpn_cls=6.98e+28, rpn_loc=5.37e+29, total_loss=nan]
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [0,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [0,0,0], thread: [1,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [0,0,0], thread: [2,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [0,0,0], thread: [3,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Epoch 1/25: 0%| | 2/1665 [00:04<57:37, 2.08s/it, lr=0.0001, roi_cls=nan, roi_loc=nan, rpn_cls=6.98e+28, rpn_loc=5.37e+29, total_loss=nan]
Traceback (most recent call last):
  File "E:/WangZhongpeng/adversarial_defence/faster-rcnn-pytorch-master/train.py", line 212, in
    fit_one_epoch(model, train_util, loss_history, optimizer, epoch, epoch_step, epoch_step_val, gen, gen_val, end_epoch, Cuda)
  File "E:\WangZhongpeng\adversarial_defence\faster-rcnn-pytorch-master\utils\utils_fit.py", line 25, in fit_one_epoch
    rpn_loc, rpn_cls, roi_loc, roi_cls, total = train_util.train_step(images, boxes, labels, 1)
  File "E:\WangZhongpeng\adversarial_defence\faster-rcnn-pytorch-master\nets\frcnn_training.py", line 325, in train_step
    losses = self.forward(imgs, bboxes, labels, scale)
  File "E:\WangZhongpeng\adversarial_defence\faster-rcnn-pytorch-master\nets\frcnn_training.py", line 311, in forward
    roi_loc_loss = self._fast_rcnn_loc_loss(roi_loc, gt_roi_loc, gt_roi_label.data, self.roi_sigma)
  File "E:\WangZhongpeng\adversarial_defence\faster-rcnn-pytorch-master\nets\frcnn_training.py", line 221, in _fast_rcnn_loc_loss
    pred_loc = pred_loc[gt_label > 0]
RuntimeError: CUDA error: device-side assert triggered
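I ran into this problem too. The cause is that the class ids in annotation.txt must be numbered starting from 0. During training the code applies a label+1 step that reserves id 0 for the background, so if annotation.txt does not start from 0, some class ids fall out of range after label+1.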
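To check an annotation file for this, a small script along the following lines may help. It is a sketch under assumptions: each line is taken to be an image path followed by boxes written as `x1,y1,x2,y2,class_id`, and the function name and sample data are made up for illustration. It flags any class id outside 0 .. num_classes - 1.

```python
from typing import List, Tuple


def find_bad_class_ids(lines: List[str], num_classes: int) -> List[Tuple[int, int]]:
    """Return (line_number, class_id) for ids outside 0 .. num_classes - 1."""
    bad = []
    for line_number, line in enumerate(lines, start=1):
        # Assumed layout: first token is the image path, the remaining
        # tokens are boxes written as "x1,y1,x2,y2,class_id".
        for box in line.split()[1:]:
            class_id = int(box.split(",")[-1])
            if not 0 <= class_id < num_classes:
                bad.append((line_number, class_id))
    return bad


# Hypothetical two-class dataset that was labelled 1 and 2 instead of 0 and 1.
sample = [
    "img1.jpg 10,20,110,220,1",
    "img2.jpg 5,5,50,50,2 30,40,90,120,1",
]
print(find_bad_class_ids(sample, num_classes=2))  # prints [(2, 2)]
```

An id equal to num_classes is the typical sign of 1-based numbering. Note that the check cannot tell whether the in-range ids are themselves shifted by one, so if any line is flagged it is worth renumbering the whole file from 0.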