WongKinYiu / ScaledYOLOv4

Scaled-YOLOv4: Scaling Cross Stage Partial Network

RuntimeError: result type Float can't be cast to the desired output type long int #400

Closed GabrielFerrante closed 2 years ago

GabrielFerrante commented 2 years ago

Command used for training with single GPU:

python3 train.py --batch-size 16 --img 416 416 --data BRA-Dataset.yaml --cfg yolov4-p5.yaml --weights yolov4-p5_.pt --device 0 --name yolov4-p5BRA-Dataset --epochs 351

OUTPUT: Using CUDA device0 _CudaDeviceProperties(name='NVIDIA GeForce RTX 3060', total_memory=12046MB)

Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='yolov4-p5.yaml', data='BRA-Dataset.yaml', device='0', epochs=351, evolve=False, global_rank=-1, hyp='data/hyp.finetune.yaml', img_size=[416, 416], local_rank=-1, logdir='runs/', multi_scale=False, name='yolov4-p5BRA-Dataset', noautoanchor=False, nosave=False, notest=False, rect=False, resume=False, single_cls=False, sync_bn=False, total_batchsize=16, weights='yolov4-p5.pt', world_size=1)
Start Tensorboard with "tensorboard --logdir runs/", view at http://localhost:6006/
Hyperparameters {'lr0': 0.01, 'momentum': 0.937, 'weight_decay': 0.0005, 'giou': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0, 'obj_pw': 1.0, 'iou_t': 0.2, 'anchor_t': 4.0, 'fl_gamma': 0.0, 'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.0, 'translate': 0.5, 'scale': 0.8, 'shear': 0.0, 'perspective': 0.0, 'flipud': 0.0, 'fliplr': 0.5, 'mixup': 0.2}

             from  n    params  module                                  arguments                     

0 -1 1 928 models.common.Conv [3, 32, 3, 1]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 19904 models.common.BottleneckCSP [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 1 161152 models.common.BottleneckCSP [128, 128, 3]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 1 2614016 models.common.BottleneckCSP [256, 256, 15]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 10438144 models.common.BottleneckCSP [512, 512, 15]
9 -1 1 4720640 models.common.Conv [512, 1024, 3, 2]
10 -1 1 20728832 models.common.BottleneckCSP [1024, 1024, 7]
11 -1 1 7610368 models.common.SPPCSP [1024, 512, 1]
12 -1 1 131584 models.common.Conv [512, 256, 1, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 8 1 131584 models.common.Conv [512, 256, 1, 1]
15 [-1, -2] 1 0 models.common.Concat [1]
16 -1 1 2298880 models.common.BottleneckCSP2 [512, 256, 3]
17 -1 1 33024 models.common.Conv [256, 128, 1, 1]
18 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
19 6 1 33024 models.common.Conv [256, 128, 1, 1]
20 [-1, -2] 1 0 models.common.Concat [1]
21 -1 1 576000 models.common.BottleneckCSP2 [256, 128, 3]
22 -1 1 295424 models.common.Conv [128, 256, 3, 1]
23 -2 1 295424 models.common.Conv [128, 256, 3, 2]
24 [-1, 16] 1 0 models.common.Concat [1]
25 -1 1 2298880 models.common.BottleneckCSP2 [512, 256, 3]
26 -1 1 1180672 models.common.Conv [256, 512, 3, 1]
27 -2 1 1180672 models.common.Conv [256, 512, 3, 2]
28 [-1, 11] 1 0 models.common.Concat [1]
29 -1 1 9185280 models.common.BottleneckCSP2 [1024, 512, 3]
30 -1 1 4720640 models.common.Conv [512, 1024, 3, 1]
31 [22, 26, 30] 1 71800 models.yolo.Detect [5, [[13, 17, 31, 25, 24, 51, 61, 45], [48, 102, 119, 96, 97, 189, 217, 184], [171, 384, 324, 451, 616, 618, 800, 800]], [256, 512, 1024]]

Model Summary: 476 layers, 7.02955e+07 parameters, 7.02955e+07 gradients

Transferred 935/943 items from yolov4-p5_.pt
Optimizer groups: 158 .bias, 163 conv.weight, 155 other
Scanning labels ../labels/train.cache (1474 found, 0 missing, 0 empty, 0 duplicate, for 1474 images): 100%|███████████████████████████████████████████| 1474/1474 [00:00<00:00, 19051.33it/s]
Scanning labels ../labels/val.cache (349 found, 0 missing, 0 empty, 0 duplicate, for 349 images): 100%|█████████████████████████████████████████████████| 349/349 [00:00<00:00, 19006.60it/s]

Analyzing anchors... anchors/target = 6.33, Best Possible Recall (BPR) = 1.0000
Image sizes 416 train, 416 test
Using 8 dataloader workers
Starting training for 351 epochs...

 Epoch   gpu_mem      GIoU       obj       cls     total   targets  img_size

0%| | 0/93 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 443, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 260, in train
    loss, loss_items = compute_loss(pred, targets.to(device), model)  # scaled by batch_size
  File "/media/usp/DATA/GabrielSFerrante/PROJETO/DetectAnimalsInRoads/YoloV4Scaled-Model/ScaledYOLOv4/utils/general.py", line 446, in compute_loss
    tcls, tbox, indices, anchors = build_targets(p, targets, model)  # targets
  File "/media/usp/DATA/GabrielSFerrante/PROJETO/DetectAnimalsInRoads/YoloV4Scaled-Model/ScaledYOLOv4/utils/general.py", line 556, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3]), gi.clamp_(0, gain[2])))  # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int

Why do I get this error? What does it mean? Is there a solution?

Pabligme commented 2 years ago

I solved this by pinning the PyTorch version:

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio===0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
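Before re-running training, it may be worth confirming the pinned build is actually the one in use. A minimal check, assuming a CUDA 11.0 setup like the RTX 3060 above:

```python
import torch
import torchvision

# Expect 1.7.1+cu110 and 0.8.2+cu110 after the pinned install.
print(torch.__version__, torchvision.__version__)
print(torch.cuda.is_available())  # the cu110 wheel should detect the GPU
```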

Pabligme commented 1 year ago

Another way to solve it is to change line 512 of utils/general.py from

gain = torch.ones(7, device=targets.device)

to

gain = torch.ones(7, device=targets.device).long()
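For context, a minimal sketch of why both fixes work. On recent PyTorch (roughly 1.9 and later), in-place ops refuse to write a type-promoted result back into the original tensor, whereas older releases such as 1.7.1 did not hit this path; the variable names mirror build_targets but this is an illustration, not the repo's code:

```python
import torch

# gain is float32 in utils/general.py; gj holds long (int64) grid indices.
gain = torch.ones(7)
gj = torch.tensor([3, 9, 12])

# Clamping a Long tensor in place with a Float bound promotes the result
# to Float, which can't be cast back into gj -> on recent PyTorch this
# raises "result type Float can't be cast to the desired output type long int".
# gj.clamp_(0, gain[3])  # raises RuntimeError

# With gain cast to long, the clamp stays in integer arithmetic:
gain = torch.ones(7).long()
gj.clamp_(0, gain[3])  # fine: both operands are int64
print(gj)
```

So the downgrade avoids the stricter in-place cast check, while the .long() cast keeps the clamp entirely in integer types; the second option leaves the environment untouched.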