lucasjinreal / yolov7_d2

🔥🔥🔥🔥 (Earlier YOLOv7 not official one) YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥
GNU General Public License v3.0

running yolov7 on colab #84

Closed (ghost closed this issue 2 years ago)

ghost commented 2 years ago
Install mish-cuda to speed up training and inference. More importantly, replace the naive Mish with MishCuda will give a ~1.5G memory saving during training.
[07/11 09:58:59 detectron2]: Arguments: Namespace(confidence_threshold=0.4, config_file='/content/yolov7/configs/coco/sparseinst/sparse_inst_r50vd_giam_aug.yaml', input='/content/yolov7/i7z08-9i4nh.flv', nms_threshold=0.6, opts=['MODEL.WEIGHTS', '/content/yolov7/weights/sparse_inst_r50vd_giam_aug_8bc5b3.pth'], output='/content/output.flv', wandb_entity=None, wandb_project=None, webcam=False)
[07/11 09:59:09 fvcore.common.checkpoint]: [Checkpointer] Loading from /content/yolov7/weights/sparse_inst_r50vd_giam_aug_8bc5b3.pth ...
[07/11 09:59:09 d2.checkpoint.c2_model_loading]: Following weights matched with model:
Names in Model Names in Checkpoint Shapes
backbone.bn1.* backbone.bn1.{bias,running_mean,running_var,weight} (64,) (64,) (64,) (64,)
backbone.conv1.0.weight backbone.conv1.0.weight (32, 3, 3, 3)
backbone.conv1.1.* backbone.conv1.1.{bias,running_mean,running_var,weight} (32,) (32,) (32,) (32,)
backbone.conv1.3.weight backbone.conv1.3.weight (32, 32, 3, 3)
backbone.conv1.4.* backbone.conv1.4.{bias,running_mean,running_var,weight} (32,) (32,) (32,) (32,)
backbone.conv1.6.weight backbone.conv1.6.weight (64, 32, 3, 3)
backbone.layer1.0.bn1.* backbone.layer1.0.bn1.{bias,running_mean,running_var,weight} (64,) (64,) (64,) (64,)
backbone.layer1.0.bn2.* backbone.layer1.0.bn2.{bias,running_mean,running_var,weight} (64,) (64,) (64,) (64,)
backbone.layer1.0.bn3.* backbone.layer1.0.bn3.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer1.0.conv1.weight backbone.layer1.0.conv1.weight (64, 64, 1, 1)
backbone.layer1.0.conv2.weight backbone.layer1.0.conv2.weight (64, 64, 3, 3)
backbone.layer1.0.conv3.weight backbone.layer1.0.conv3.weight (256, 64, 1, 1)
backbone.layer1.0.downsample.1.weight backbone.layer1.0.downsample.1.weight (256, 64, 1, 1)
backbone.layer1.0.downsample.2.* backbone.layer1.0.downsample.2.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer1.1.bn1.* backbone.layer1.1.bn1.{bias,running_mean,running_var,weight} (64,) (64,) (64,) (64,)
backbone.layer1.1.bn2.* backbone.layer1.1.bn2.{bias,running_mean,running_var,weight} (64,) (64,) (64,) (64,)
backbone.layer1.1.bn3.* backbone.layer1.1.bn3.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer1.1.conv1.weight backbone.layer1.1.conv1.weight (64, 256, 1, 1)
backbone.layer1.1.conv2.weight backbone.layer1.1.conv2.weight (64, 64, 3, 3)
backbone.layer1.1.conv3.weight backbone.layer1.1.conv3.weight (256, 64, 1, 1)
backbone.layer1.2.bn1.* backbone.layer1.2.bn1.{bias,running_mean,running_var,weight} (64,) (64,) (64,) (64,)
backbone.layer1.2.bn2.* backbone.layer1.2.bn2.{bias,running_mean,running_var,weight} (64,) (64,) (64,) (64,)
backbone.layer1.2.bn3.* backbone.layer1.2.bn3.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer1.2.conv1.weight backbone.layer1.2.conv1.weight (64, 256, 1, 1)
backbone.layer1.2.conv2.weight backbone.layer1.2.conv2.weight (64, 64, 3, 3)
backbone.layer1.2.conv3.weight backbone.layer1.2.conv3.weight (256, 64, 1, 1)
backbone.layer2.0.bn1.* backbone.layer2.0.bn1.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.0.bn2.* backbone.layer2.0.bn2.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.0.bn3.* backbone.layer2.0.bn3.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer2.0.conv1.weight backbone.layer2.0.conv1.weight (128, 256, 1, 1)
backbone.layer2.0.conv2.weight backbone.layer2.0.conv2.weight (128, 128, 3, 3)
backbone.layer2.0.conv3.weight backbone.layer2.0.conv3.weight (512, 128, 1, 1)
backbone.layer2.0.downsample.1.weight backbone.layer2.0.downsample.1.weight (512, 256, 1, 1)
backbone.layer2.0.downsample.2.* backbone.layer2.0.downsample.2.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer2.1.bn1.* backbone.layer2.1.bn1.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.1.bn2.* backbone.layer2.1.bn2.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.1.bn3.* backbone.layer2.1.bn3.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer2.1.conv1.weight backbone.layer2.1.conv1.weight (128, 512, 1, 1)
backbone.layer2.1.conv2.weight backbone.layer2.1.conv2.weight (128, 128, 3, 3)
backbone.layer2.1.conv3.weight backbone.layer2.1.conv3.weight (512, 128, 1, 1)
backbone.layer2.2.bn1.* backbone.layer2.2.bn1.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.2.bn2.* backbone.layer2.2.bn2.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.2.bn3.* backbone.layer2.2.bn3.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer2.2.conv1.weight backbone.layer2.2.conv1.weight (128, 512, 1, 1)
backbone.layer2.2.conv2.weight backbone.layer2.2.conv2.weight (128, 128, 3, 3)
backbone.layer2.2.conv3.weight backbone.layer2.2.conv3.weight (512, 128, 1, 1)
backbone.layer2.3.bn1.* backbone.layer2.3.bn1.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.3.bn2.* backbone.layer2.3.bn2.{bias,running_mean,running_var,weight} (128,) (128,) (128,) (128,)
backbone.layer2.3.bn3.* backbone.layer2.3.bn3.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer2.3.conv1.weight backbone.layer2.3.conv1.weight (128, 512, 1, 1)
backbone.layer2.3.conv2.weight backbone.layer2.3.conv2.weight (128, 128, 3, 3)
backbone.layer2.3.conv3.weight backbone.layer2.3.conv3.weight (512, 128, 1, 1)
backbone.layer3.0.bn1.* backbone.layer3.0.bn1.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.0.bn2.* backbone.layer3.0.bn2.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.0.bn3.* backbone.layer3.0.bn3.{bias,running_mean,running_var,weight} (1024,) (1024,) (1024,) (1024,)
backbone.layer3.0.conv1.weight backbone.layer3.0.conv1.weight (256, 512, 1, 1)
backbone.layer3.0.conv2.weight backbone.layer3.0.conv2.weight (256, 256, 3, 3)
backbone.layer3.0.conv3.weight backbone.layer3.0.conv3.weight (1024, 256, 1, 1)
backbone.layer3.0.downsample.1.weight backbone.layer3.0.downsample.1.weight (1024, 512, 1, 1)
backbone.layer3.0.downsample.2.* backbone.layer3.0.downsample.2.{bias,running_mean,running_var,weight} (1024,) (1024,) (1024,) (1024,)
backbone.layer3.1.bn1.* backbone.layer3.1.bn1.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.1.bn2.* backbone.layer3.1.bn2.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.1.bn3.* backbone.layer3.1.bn3.{bias,running_mean,running_var,weight} (1024,) (1024,) (1024,) (1024,)
backbone.layer3.1.conv1.weight backbone.layer3.1.conv1.weight (256, 1024, 1, 1)
backbone.layer3.1.conv2.weight backbone.layer3.1.conv2.weight (256, 256, 3, 3)
backbone.layer3.1.conv3.weight backbone.layer3.1.conv3.weight (1024, 256, 1, 1)
backbone.layer3.2.bn1.* backbone.layer3.2.bn1.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.2.bn2.* backbone.layer3.2.bn2.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.2.bn3.* backbone.layer3.2.bn3.{bias,running_mean,running_var,weight} (1024,) (1024,) (1024,) (1024,)
backbone.layer3.2.conv1.weight backbone.layer3.2.conv1.weight (256, 1024, 1, 1)
backbone.layer3.2.conv2.weight backbone.layer3.2.conv2.weight (256, 256, 3, 3)
backbone.layer3.2.conv3.weight backbone.layer3.2.conv3.weight (1024, 256, 1, 1)
backbone.layer3.3.bn1.* backbone.layer3.3.bn1.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.3.bn2.* backbone.layer3.3.bn2.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.3.bn3.* backbone.layer3.3.bn3.{bias,running_mean,running_var,weight} (1024,) (1024,) (1024,) (1024,)
backbone.layer3.3.conv1.weight backbone.layer3.3.conv1.weight (256, 1024, 1, 1)
backbone.layer3.3.conv2.weight backbone.layer3.3.conv2.weight (256, 256, 3, 3)
backbone.layer3.3.conv3.weight backbone.layer3.3.conv3.weight (1024, 256, 1, 1)
backbone.layer3.4.bn1.* backbone.layer3.4.bn1.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.4.bn2.* backbone.layer3.4.bn2.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.4.bn3.* backbone.layer3.4.bn3.{bias,running_mean,running_var,weight} (1024,) (1024,) (1024,) (1024,)
backbone.layer3.4.conv1.weight backbone.layer3.4.conv1.weight (256, 1024, 1, 1)
backbone.layer3.4.conv2.weight backbone.layer3.4.conv2.weight (256, 256, 3, 3)
backbone.layer3.4.conv3.weight backbone.layer3.4.conv3.weight (1024, 256, 1, 1)
backbone.layer3.5.bn1.* backbone.layer3.5.bn1.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.5.bn2.* backbone.layer3.5.bn2.{bias,running_mean,running_var,weight} (256,) (256,) (256,) (256,)
backbone.layer3.5.bn3.* backbone.layer3.5.bn3.{bias,running_mean,running_var,weight} (1024,) (1024,) (1024,) (1024,)
backbone.layer3.5.conv1.weight backbone.layer3.5.conv1.weight (256, 1024, 1, 1)
backbone.layer3.5.conv2.weight backbone.layer3.5.conv2.weight (256, 256, 3, 3)
backbone.layer3.5.conv3.weight backbone.layer3.5.conv3.weight (1024, 256, 1, 1)
backbone.layer4.0.bn1.* backbone.layer4.0.bn1.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer4.0.bn2.* backbone.layer4.0.bn2.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer4.0.bn3.* backbone.layer4.0.bn3.{bias,running_mean,running_var,weight} (2048,) (2048,) (2048,) (2048,)
backbone.layer4.0.conv1.weight backbone.layer4.0.conv1.weight (512, 1024, 1, 1)
backbone.layer4.0.conv2.weight backbone.layer4.0.conv2.weight (512, 512, 3, 3)
backbone.layer4.0.conv3.weight backbone.layer4.0.conv3.weight (2048, 512, 1, 1)
backbone.layer4.0.downsample.1.weight backbone.layer4.0.downsample.1.weight (2048, 1024, 1, 1)
backbone.layer4.0.downsample.2.* backbone.layer4.0.downsample.2.{bias,running_mean,running_var,weight} (2048,) (2048,) (2048,) (2048,)
backbone.layer4.1.bn1.* backbone.layer4.1.bn1.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer4.1.bn2.* backbone.layer4.1.bn2.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer4.1.bn3.* backbone.layer4.1.bn3.{bias,running_mean,running_var,weight} (2048,) (2048,) (2048,) (2048,)
backbone.layer4.1.conv1.weight backbone.layer4.1.conv1.weight (512, 2048, 1, 1)
backbone.layer4.1.conv2.weight backbone.layer4.1.conv2.weight (512, 512, 3, 3)
backbone.layer4.1.conv3.weight backbone.layer4.1.conv3.weight (2048, 512, 1, 1)
backbone.layer4.2.bn1.* backbone.layer4.2.bn1.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer4.2.bn2.* backbone.layer4.2.bn2.{bias,running_mean,running_var,weight} (512,) (512,) (512,) (512,)
backbone.layer4.2.bn3.* backbone.layer4.2.bn3.{bias,running_mean,running_var,weight} (2048,) (2048,) (2048,) (2048,)
backbone.layer4.2.conv1.weight backbone.layer4.2.conv1.weight (512, 2048, 1, 1)
backbone.layer4.2.conv2.weight backbone.layer4.2.conv2.weight (512, 512, 3, 3)
backbone.layer4.2.conv3.weight backbone.layer4.2.conv3.weight (2048, 512, 1, 1)
decoder.inst_branch.cls_score.* decoder.inst_branch.cls_score.{bias,weight} (80,) (80,1024)
decoder.inst_branch.fc.* decoder.inst_branch.fc.{bias,weight} (1024,) (1024,1024)
decoder.inst_branch.iam_conv.* decoder.inst_branch.iam_conv.{bias,weight} (400,) (400,64,3,3)
decoder.inst_branch.inst_convs.0.* decoder.inst_branch.inst_convs.0.{bias,weight} (256,) (256,258,3,3)
decoder.inst_branch.inst_convs.2.* decoder.inst_branch.inst_convs.2.{bias,weight} (256,) (256,256,3,3)
decoder.inst_branch.inst_convs.4.* decoder.inst_branch.inst_convs.4.{bias,weight} (256,) (256,256,3,3)
decoder.inst_branch.inst_convs.6.* decoder.inst_branch.inst_convs.6.{bias,weight} (256,) (256,256,3,3)
decoder.inst_branch.mask_kernel.* decoder.inst_branch.mask_kernel.{bias,weight} (128,) (128,1024)
decoder.inst_branch.objectness.* decoder.inst_branch.objectness.{bias,weight} (1,) (1,1024)
decoder.mask_branch.mask_convs.0.* decoder.mask_branch.mask_convs.0.{bias,weight} (256,) (256,258,3,3)
decoder.mask_branch.mask_convs.2.* decoder.mask_branch.mask_convs.2.{bias,weight} (256,) (256,256,3,3)
decoder.mask_branch.mask_convs.4.* decoder.mask_branch.mask_convs.4.{bias,weight} (256,) (256,256,3,3)
decoder.mask_branch.mask_convs.6.* decoder.mask_branch.mask_convs.6.{bias,weight} (256,) (256,256,3,3)
decoder.mask_branch.projection.* decoder.mask_branch.projection.{bias,weight} (128,) (128,256,1,1)
encoder.fpn_laterals.0.* encoder.fpn_laterals.0.{bias,weight} (256,) (256,2048,1,1)
encoder.fpn_laterals.1.* encoder.fpn_laterals.1.{bias,weight} (256,) (256,1024,1,1)
encoder.fpn_laterals.2.* encoder.fpn_laterals.2.{bias,weight} (256,) (256,512,1,1)
encoder.fpn_outputs.0.* encoder.fpn_outputs.0.{bias,weight} (256,) (256,256,3,3)
encoder.fpn_outputs.1.* encoder.fpn_outputs.1.{bias,weight} (256,) (256,256,3,3)
encoder.fpn_outputs.2.* encoder.fpn_outputs.2.{bias,weight} (256,) (256,256,3,3)
encoder.fusion.* encoder.fusion.{bias,weight} (256,) (256,768,1,1)
encoder.ppm.bottleneck.* encoder.ppm.bottleneck.{bias,weight} (256,) (256,512,1,1)
encoder.ppm.stages.0.1.* encoder.ppm.stages.0.1.{bias,weight} (64,) (64,256,1,1)
encoder.ppm.stages.1.1.* encoder.ppm.stages.1.1.{bias,weight} (64,) (64,256,1,1)
encoder.ppm.stages.2.1.* encoder.ppm.stages.2.1.{bias,weight} (64,) (64,256,1,1)
encoder.ppm.stages.3.1.* encoder.ppm.stages.3.1.{bias,weight} (64,) (64,256,1,1)

640 640 600
confidence thresh: 0.4
OpenCV: FFMPEG: tag 0x44495658/'XVID' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
0% 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/content/yolov7/demo.py", line 238, in <module>
    cv2.imshow(out_path, res)
NameError: name 'res' is not defined

I'm trying to run this code on Colab, but I got this error. Could you please give me some advice?

lucasjinreal commented 2 years ago

@EtherealO try an image file path rather than a video. I think video should be OK too, but you should debug the code a little bit; I have no problem on my side.
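
For anyone debugging the same traceback: one common way to get a `NameError` on `res` is that the line assigning `res` sits inside a branch that was skipped before `cv2.imshow(out_path, res)` ran. The sketch below is not the code from `demo.py`; `process_frames`, `buggy_loop`, `predict` and `show` are hypothetical names used only to illustrate the pattern and one possible guard.

```python
from typing import Callable, Iterable, List, Optional


def buggy_loop(frames: Iterable[object],
               predict: Callable[[object], Optional[str]],
               show: Callable[[str], None]) -> None:
    """Illustrates the failure: `res` is only bound inside the `if`."""
    for frame in frames:
        out = predict(frame)
        if out is not None:
            res = out
        # If the branch above was skipped on the first frame, `res` was never
        # assigned. Inside a function this raises UnboundLocalError, which is
        # a subclass of NameError; at module level it is a plain NameError.
        show(res)


def process_frames(frames: Iterable[object],
                   predict: Callable[[object], Optional[str]],
                   show: Callable[[str], None]) -> List[str]:
    """Same loop, but `res` is bound on every iteration and checked first."""
    shown: List[str] = []
    for frame in frames:
        res = None  # always bind the name before any branch
        out = predict(frame)
        if out is not None:
            res = out
        if res is None:
            # Nothing to display for this frame; skip instead of crashing.
            continue
        show(res)
        shown.append(res)
    return shown
```

With a guard like this, a frame that produces no result is skipped, and you can print or log at the `continue` to find out which branch in the real script leaves `res` unset.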

ghost commented 2 years ago

Thanks a lot for that, I'll try later.

avtregubov commented 1 year ago

Have you solved this problem?