ultralytics / yolov3

YOLOv3 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient #1795

Closed: jaskiratsingh2000 closed this 3 years ago

jaskiratsingh2000 commented 3 years ago

@glenn-jocher I really need your help here. I was not able to run YOLOv5 due to errors, and I see the same errors in YOLOv3 as well. Can you please help me resolve them?

Command ran: python3 train.py --img 416 --batch 10 --epochs 100 --data ./data.yaml --cfg ./models/yolov3.yaml --weights yolov3.pt

I am running this locally, and I have already run it in your environments as well, so please take a look at how I can resolve this error.

YOLOv3 πŸš€ v9.5.0-13-g1be3170 torch 1.7.0a0+57bffc3 CPU

Namespace(adam=False, artifact_alias='latest', batch_size=10, bbox_interval=-1, bucket='', cache_images=False, cfg='./models/yolov3.yaml', data='./data.yaml', device='', entity=None, epochs=100, evolve=False, exist_ok=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[416, 416], label_smoothing=0.0, linear_lr=False, local_rank=-1, multi_scale=False, name='exp', noautoanchor=False, nosave=False, notest=False, project='runs/train', quad=False, rect=False, resume=False, save_dir='runs/train/exp8', save_period=-1, single_cls=False, sync_bn=False, total_batch_size=10, upload_dataset=False, weights='yolov3.pt', workers=8, world_size=1)
tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
wandb: Install Weights & Biases for YOLOv3 logging with 'pip install wandb' (recommended)

                 from  n    params  module                                  arguments                     
  0                -1  1       232  models.common.Conv                      [3, 8, 3, 1]                  
  1                -1  1       592  models.common.Conv                      [8, 8, 3, 2]                  
  2                -1  1       344  models.common.Bottleneck                [8, 8]                        
  3                -1  1       592  models.common.Conv                      [8, 8, 3, 2]                  
  4                -1  1       344  models.common.Bottleneck                [8, 8]                        
  5                -1  1       592  models.common.Conv                      [8, 8, 3, 2]                  
  6                -1  1       344  models.common.Bottleneck                [8, 8]                        
  7                -1  1      1184  models.common.Conv                      [8, 16, 3, 2]                 
  8                -1  1      1328  models.common.Bottleneck                [16, 16]                      
  9                -1  1      3504  models.common.Conv                      [16, 24, 3, 2]                
 10                -1  1      2952  models.common.Bottleneck                [24, 24]                      
 11                -1  1      2952  models.common.Bottleneck                [24, 24, False]               
 12                -1  1       416  models.common.Conv                      [24, 16, [1, 1]]              
 13                -1  1      3504  models.common.Conv                      [16, 24, 3, 1]                
 14                -1  1       416  models.common.Conv                      [24, 16, 1, 1]                
 15                -1  1      3504  models.common.Conv                      [16, 24, 3, 1]                
 16                -2  1       144  models.common.Conv                      [16, 8, 1, 1]                 
 17                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 18           [-1, 8]  1         0  models.common.Concat                    [1]                           
 19                -1  1      1392  models.common.Bottleneck                [24, 16, False]               
 20                -1  1      1328  models.common.Bottleneck                [16, 16, False]               
 21                -1  1       144  models.common.Conv                      [16, 8, 1, 1]                 
 22                -1  1      1184  models.common.Conv                      [8, 16, 3, 1]                 
 23                -2  1        80  models.common.Conv                      [8, 8, 1, 1]                  
 24                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 25           [-1, 6]  1         0  models.common.Concat                    [1]                           
 26                -1  1       376  models.common.Bottleneck                [16, 8, False]                
 27                -1  1       344  models.common.Bottleneck                [8, 8, False]                 
 28      [27, 22, 15]  1      1071  models.yolo.Detect                      [2, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [8, 16, 24]]
Model Summary: 157 layers, 28863 parameters, 28863 gradients

Transferred 24/212 items from yolov3.pt
Scaled weight_decay = 0.00046875
Optimizer groups: 37 .bias, 37 conv.weight, 34 other
val: Scanning 'valid/labels.cache' images and labels... 29 found,
val: Scanning 'valid/labels.cache' images and labels... 29 found,
Plotting labels... 
train: Scanning 'train/labels.cache' images and labels... 315 fou

autoanchor: Analyzing anchors... anchors/target = 5.46, Best Possible Recall (BPR) = 0.9984
Image sizes 416 train, 416 test
Using 4 dataloader workers
Logging results to runs/train/exp8
Starting training for 100 epochs...

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
      0/99        0G    0.1063   0.02055   0.02829    0.1551     
Traceback (most recent call last):
  File "train.py", line 541, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 334, in train
    tb_writer.add_graph(torch.jit.trace(de_parallel(model), imgs, strict=False), [])  # model graph
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 733, in trace
    return trace_module(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 934, in trace_module
    module._c._create_method_from_trace(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 725, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 709, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/Desktop/yolov3/models/yolo.py", line 121, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/home/ubuntu/Desktop/yolov3/models/yolo.py", line 152, in forward_once
    x = m(x)  # run
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 725, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 709, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/Desktop/yolov3/models/common.py", line 42, in forward
    return self.act(self.bn(self.conv(x)))
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 725, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 709, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient
Tensor:
(1,1,.,.) = 
  0.0992 -0.0850 -0.0373
  0.0903 -0.1812  0.1154
 -0.0396  0.0979  0.0267

(2,1,.,.) = 
  0.1046 -0.1881  0.1193
  0.0538  0.1825  0.1271
 -0.1753 -0.1830 -0.0928

(3,1,.,.) = 
  0.1917  0.1543 -0.0090
 -0.1284  0.1172  0.0597
 -0.1244  0.1250  0.1168

(4,1,.,.) = 
 -0.1453  0.0117 -0.0328
  0.1130 -0.1115 -0.1711
  0.1400 -0.0285  0.1082

(5,1,.,.) = 
  0.1320  0.0399  0.0619
  0.1438  0.1825 -0.1277
  0.0241  0.1437  0.1394

(6,1,.,.) = 
  0.1367  0.1217  0.0497
 -0.1316 -0.1616 -0.0882
 -0.0224 -0.1180  0.0704

(7,1,.,.) = 
  0.0616  0.1246 -0.0996
  0.0417 -0.0701 -0.0432
 -0.1533 -0.0877 -0.0589

(8,1,.,.) = 
 -0.1041  0.0688 -0.0741
 -0.0904  0.0109  0.1393
 -0.1354  0.0904  0.1237

(1,2,.,.) = 
 0.01 *
 -2.3560  5.3375  0.9491
   7.0312 -7.5012 -1.4030
  -1.7319  2.7893 -0.0769

(2,2,.,.) = 
  0.1689 -0.0320  0.0823
 -0.0894  0.1888 -0.0814
  0.1443  0.0023 -0.1014

(3,2,.,.) = 
  0.1707 -0.1079 -0.0317
 -0.0037  0.0281 -0.1460
 -0.1366  0.1047 -0.0451

(4,2,.,.) = 
 0.01 *
  6.1859 -14.4287  3.8666
   4.6234 -12.8906 -9.1309
   6.5613  3.4485 -8.1848

(5,2,.,.) = 
  0.1196 -0.1393 -0.1385
 -0.1164  0.0242  0.1918
 -0.1216  0.1025 -0.1065

(6,2,.,.) = 
  0.0596 -0.0436  0.0740
  0.0622  0.1175  0.1296
 -0.0652  0.1880 -0.0222

(7,2,.,.) = 
  0.0823  0.0352  0.0475
  0.1921  0.1876  0.1312
  0.0061 -0.1332  0.1504

(8,2,.,.) = 
  0.1882 -0.1346  0.0466
 -0.1423  0.1643 -0.0746
  0.1159  0.0057 -0.0150

(1,3,.,.) = 
  0.1682  0.0599 -0.0717
 -0.1162 -0.0323 -0.0830
 -0.0617  0.0092  0.1147

(2,3,.,.) = 
  0.0989 -0.1022  0.0566
 -0.0556 -0.0211 -0.1851
 -0.0917  0.1044 -0.0468

(3,3,.,.) = 
 0.01 *
  9.3994  1.0971  6.3171
   4.2328  6.9946  9.5398
  -17.8223  9.6863 -13.5376

(4,3,.,.) = 
 -0.0583  0.1763 -0.0356
  0.1085  0.0833 -0.1244
 -0.1637  0.1847  0.0100

(5,3,.,.) = 
 -0.1809 -0.0409  0.1109
  0.1787 -0.1195  0.0418
  0.1660  0.1276  0.1199

(6,3,.,.) = 
 -0.0066 -0.1816 -0.1238
 -0.1124 -0.0823  0.1368
 -0.0629 -0.1438  0.0740

(7,3,.,.) = 
 -0.0481 -0.0156 -0.1658
 -0.0380 -0.1241  0.1769
 -0.1664 -0.1500 -0.0065

(8,3,.,.) = 
 -0.0061  0.0327  0.0907
  0.0309  0.0587 -0.1731
  0.1403  0.1678  0.1591
[ torch.FloatTensor{8,3,3,3} ]
glenn-jocher commented 3 years ago

@jaskiratsingh2000 it appears you may have environment problems. Please ensure you meet all dependency requirements if you are attempting to run YOLOv5 locally. If in doubt, create a new virtual Python 3.8 environment, clone the latest repo (code changes daily), and pip install -r requirements.txt again. We also highly recommend using one of our verified environments below.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu.

jaskiratsingh2000 commented 3 years ago

@glenn-jocher I am meeting all the requirements.

Also, I just checked that a similar kind of issue was referenced here, but I am not sure whether commenting or uncommenting anything would solve it. I am using Python 3.8.5 with PyTorch 1.7.0 (the torch 1.7.0a0+57bffc3 build shown in the log above) and torchvision 0.8.1.
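
Since the log at the top shows torch 1.7.0a0+57bffc3, which looks like a nightly/alpha build rather than a release, this is the plain version check I am running to see which builds the interpreter for train.py actually picks up (nothing repo-specific, just the standard torch and torchvision version attributes):

import torch
import torchvision

print(torch.__version__)        # prints 1.7.0a0+57bffc3 here, i.e. a nightly/alpha build
print(torchvision.__version__)  # prints 0.8.1, the torchvision release paired with torch 1.7.0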

jaskiratsingh2000 commented 3 years ago

Can you suggest edits, like what to comment or uncomment, because everything seems to be okay in terms of requirements on my end? For example, is something like the sketch below a reasonable change?
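
From the traceback, the failing call is the TensorBoard model-graph trace at train.py line 334. Purely as an illustration of what I mean, and not a confirmed fix, one edit would be to guard that trace so a failed torch.jit.trace is logged instead of stopping training:

# train.py, around line 334 from the traceback above (hypothetical edit, not an official fix)
# tb_writer, de_parallel, model and imgs already exist at this point in train.py
if tb_writer:
    try:
        tb_writer.add_graph(torch.jit.trace(de_parallel(model), imgs, strict=False), [])  # model graph
    except Exception as e:
        print(f'TensorBoard graph logging failed, continuing training: {e}')

Simply commenting that line out should also let training proceed if the model graph in TensorBoard is not needed.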

Please also let me know if there is any other possibility. @glenn-jocher

jaskiratsingh2000 commented 3 years ago

@glenn-jocher please let me know if possible; that would be of great help.

glenn-jocher commented 3 years ago

@jaskiratsingh2000 use a verified environment:

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

jaskiratsingh2000 commented 3 years ago

@glenn-jocher I totally understand that, but my issue is that I don't want to run in the verified environments; I want to run on a Raspberry Pi.

It's quite sad for me to see that, even though this is a repo problem, there is no solution to it, and I'm not even getting a better response 😌

glenn-jocher commented 3 years ago

@jaskiratsingh2000 soon there will be a good solution, as Raspberry Pi is one of our YOLOv5 EXPORT Competition categories:

Compete and Win

We are super excited about our first-ever Ultralytics YOLOv5 πŸš€ EXPORT Competition with $10,000 in cash prizes!

jaskiratsingh2000 commented 3 years ago

When would that be possible?

Stepend commented 3 years ago

Is this problem solved now? I am facing the same issue, and all the requirements are satisfied!

glenn-jocher commented 3 years ago

@Stepend it appears you may have environment problems. Please ensure you meet all dependency requirements if you are attempting to run YOLOv5 locally. If in doubt, create a new virtual Python 3.8 environment, clone the latest repo (code changes daily), and pip install -r requirements.txt again. We also highly recommend using one of our verified environments below.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu.

github-actions[bot] commented 3 years ago

πŸ‘‹ Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv3 πŸš€ resources:

Access additional Ultralytics ⚑ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcome!

Thank you for your contributions to YOLOv3 πŸš€ and Vision AI ⭐!