Closed: xuezhongcailian closed this issue 4 years ago
one of the variables needed for gradient computation has been modified by an inplace operation
Hi, I have never tried torch 1.2 in this repo. Can you try with torch 1.0.0 as written in requirements.txt?
Or maybe you can avoid that error by replacing in-place operations in yolo_layer.py.
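As a rough illustration of what "replacing in-place operations" means, here is a minimal sketch that is not taken from yolo_layer.py. The function names are hypothetical, and `exp` is used only because it is an operation whose backward pass reuses its own output. Modifying that output in place reproduces the error in the title, while the out-of-place form avoids it. To find the offending line in real code, enabling `torch.autograd.set_detect_anomaly(True)` before the forward pass may help, since it reports the forward call that produced the tensor in question.

```python
import torch


def inplace_breaks_backward():
    """Trigger the autograd error by editing a saved tensor in place."""
    x = torch.ones(3, requires_grad=True)
    y = x.exp()      # exp keeps its output for the backward pass
    y.add_(1)        # in-place edit invalidates that saved output
    try:
        y.sum().backward()
    except RuntimeError as err:
        return str(err)
    return None


def out_of_place_works():
    """Same computation, but y + 1 creates a new tensor instead."""
    x = torch.ones(3, requires_grad=True)
    y = x.exp()
    y = y + 1        # out-of-place: the saved exp output is untouched
    y.sum().backward()
    return x.grad


print(inplace_breaks_backward())
print(out_of_place_works())
```

The same pattern applies to operators such as `+=`, `*=`, and to methods ending in an underscore; whether a given one actually causes the error depends on whether autograd needs the overwritten value.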
May I ask which line contains the in-place operation that needs to be modified? When I train the model on my own data, this error occurs: "IndexError: index 76 is out of bounds for dimension 3 with size 76"
@milliema Your error seems to be different from the one caused by in-place operations. Could you show me the whole error message? Otherwise I cannot say anything for sure.
Thanks for your quick reply! I've modified the code a little to use it on my own dataset. The modifications are: 1) changed N_CLASSES in the cfg file; 2) changed the train/val data directories, following the COCO format. Then, when I run train.py, the 1st error occurs as below:
Traceback (most recent call last):
  File "train_am.py", line 237, in <module>
    main()
  File "train_am.py", line 185, in main
    loss = model(imgs, targets)
  File "/home/ubuntu/miniconda3/envs/autom/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/ubuntu/HDD/project/PyTorch_Gaussian_YOLOv3/models/yolov3.py", line 154, in forward
    x, loss_dict = module(x, targets)
  File "/home/ubuntu/miniconda3/envs/autom/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/ubuntu/HDD/project/PyTorch_Gaussian_YOLOv3/models/yolo_layer.py", line 188, in forward
    obj_mask[b] = 1-pred_best_iou
  File "/home/ubuntu/miniconda3/envs/autom/lib/python3.6/site-packages/torch/tensor.py", line 325, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `bitwise_not()` operator instead.
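The error arises because comparisons return bool tensors in newer PyTorch versions, and `1 - mask` is not defined for them. Below is a minimal sketch of a mask inversion that should behave the same for bool and uint8 masks. The helper name `invert_mask` and the sample IoU values are assumptions for illustration and are not part of the repo.

```python
import torch


def invert_mask(mask):
    """Return the logical inverse of a 0/1 mask as a float tensor."""
    if mask.dtype == torch.bool:
        # Bool masks: ~ is a logical NOT.
        return (~mask).float()
    # uint8 masks (older PyTorch): ~ would be a bitwise NOT (1 -> 254),
    # so convert to float and subtract instead.
    return 1 - mask.float()


# Hypothetical IoU values compared against a 0.7 ignore threshold.
pred_best_iou = torch.tensor([0.8, 0.3, 0.75]) > 0.7
obj_mask_row = invert_mask(pred_best_iou)
print(obj_mask_row)
```

Converting to float first keeps the result usable as a multiplicative loss mask regardless of which dtype the comparison produced.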
Then I changed the code "obj_mask[b] = 1-pred_best_iou" in yolo_layer.py to "obj_mask[b] = ~pred_best_iou", and the 2nd error occurs as below:
Setting Arguments.. : Namespace(cfg='config/automotive_default.cfg', checkpoint=None, checkpoint_dir='checkpoints', checkpoint_interval=1000, debug=False, eval_interval=4000, n_cpu=0, tfboard_dir=None, use_cuda=True, weights_path='/media/ubuntu/HDD/project/PyTorch_Gaussian_YOLOv3/gaussian_yolov3_coco.pth')
train_am.py:57: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
cfg = yaml.load(f)
successfully loaded config file: {'MODEL': {'TYPE': 'YOLOv3', 'BACKBONE': 'darknet53', 'ANCHORS': [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], 'ANCH_MASK': [[6, 7, 8], [3, 4, 5], [0, 1, 2]], 'N_CLASSES': 2, 'GAUSSIAN': True}, 'TRAIN': {'LR': 0.001, 'MOMENTUM': 0.9, 'DECAY': 0.0005, 'BURN_IN': 1000, 'MAXITER': 500000, 'STEPS': '(400000, 450000)', 'BATCHSIZE': 4, 'SUBDIVISION': 16, 'IMGSIZE': 608, 'LOSSTYPE': 'l2', 'IGNORETHRE': 0.7, 'GRADIENT_CLIP': 2000.0}, 'AUGMENTATION': {'RANDRESIZE': True, 'JITTER': 0.3, 'RANDOM_PLACING': True, 'HUE': 0.1, 'SATURATION': 1.5, 'EXPOSURE': 1.5, 'LRFLIP': True, 'RANDOM_DISTORT': True}, 'TEST': {'CONFTHRE': 0.8, 'NMSTHRE': 0.45, 'IMGSIZE': 416}, 'NUM_GPUS': 1}
effective_batch_size = batch_size * iter_size = 4 * 16
Gaussian YOLOv3
Gaussian YOLOv3
Gaussian YOLOv3
loading darknet weights.... /media/ubuntu/HDD/project/PyTorch_Gaussian_YOLOv3/gaussian_yolov3_coco.pth
using cuda
loading annotations into memory...
Done (t=0.05s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
evaluating...
obj_mask torch.Size([4, 3, 76, 76])
0 tensor(0) 0 0
obj_mask torch.Size([4, 3, 76, 76])
1 tensor(0) 0 76
Traceback (most recent call last):
  File "train_am.py", line 237, in <module>
Do you have any ideas about the error? Thanks for your help.
IndexError: index 76 is out of bounds for dimension 3 with size 76
Dimension 3 (the last axis of obj_mask, size 76) represents the x-index on the feature map.
In your dataset, some of the boxes might be located (partially) outside of the image.
Is it possible to clip those boxes to the image in your dataset?
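One way to do the clipping is sketched below. It assumes COCO-style `[x, y, width, height]` boxes in pixel coordinates; the helper names `clip_bbox_xywh` and `grid_index` are hypothetical and do not come from the repo. `grid_index` only mimics how a box center is mapped to a feature-map cell, to show why a center at or beyond the right edge of a 608-pixel image yields index 76 on a 76-cell grid.

```python
def clip_bbox_xywh(bbox, img_w, img_h):
    """Clip a COCO-style [x, y, w, h] box to the image.

    Returns None if nothing of the box remains inside the image.
    """
    x, y, w, h = bbox
    x1 = min(max(x, 0.0), img_w)
    y1 = min(max(y, 0.0), img_h)
    x2 = min(max(x + w, 0.0), img_w)
    y2 = min(max(y + h, 0.0), img_h)
    if x2 <= x1 or y2 <= y1:
        return None  # box lies entirely outside the image
    return [x1, y1, x2 - x1, y2 - y1]


def grid_index(center, img_size, grid_size):
    """Map a pixel coordinate to a feature-map cell index (illustrative)."""
    return int(center / img_size * grid_size)


box = [600, 10, 20, 20]                 # extends past the right edge of a 608 px image
print(grid_index(600 + 20 / 2, 608, 76))          # center at x=610 -> 76 (out of range)
clipped = clip_bbox_xywh(box, 608, 608)
print(clipped)                                    # [600, 10, 8, 20]
print(grid_index(clipped[0] + clipped[2] / 2, 608, 76))  # center at x=604 -> 75
```

A clipped box with positive width always has its center strictly inside the image, so its cell index stays below the grid size. Boxes for which the helper returns None would need to be dropped from the annotations.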
Hi, can this repo be trained with PyTorch 1.2?