How can I produce profiling (time for each layer) for yolov3-tiny?

jaskiratsingh2000 commented 3 years ago

Hi @glenn-jocher, I am raising issue #3421 from the yolov5 repo here, since I believe this is the right place to discuss it.

I want to produce profiling like the output below, but for yolov3-tiny. I know how to do this for yolov5, but there are no steps for yolov3-tiny, and I don't see a separate repo for it. Can you please let me know how to do this, @glenn-jocher?

This is the kind of output I want to produce for yolov3-tiny:


YOLOv5 🚀 v5.0-100-g4a8d238 torch 1.7.0a0+e85d494 CPU

                 from  n    params  module                                  arguments                     
  0                -1  1      3520  models.common.Focus                     [3, 32, 3]                    
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  2                -1  1     18816  models.common.C3                        [64, 64, 1]                   
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  4                -1  1    156928  models.common.C3                        [128, 128, 3]                 
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
  6                -1  1    625152  models.common.C3                        [256, 256, 3]                 
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
  8                -1  1    656896  models.common.SPP                       [512, 512, [5, 9, 13]]        
  9                -1  1   1182720  models.common.C3                        [512, 512, 1, False]          
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]          
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]          
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]              
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]          
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]          
 24      [17, 20, 23]  1    229245  Detect                                  [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 283 layers, 7276605 parameters, 7276605 gradients

 time (ms)     GFLOPS     params  module
    180.91       0.00       3520  models.common.Focus
    153.11       0.00      18560  models.common.Conv
    321.00       0.00      18816  models.common.C3
     92.07       0.00      73984  models.common.Conv
    495.51       0.00     156928  models.common.C3
     74.00       0.00     295424  models.common.Conv
    403.86       0.00     625152  models.common.C3
     64.33       0.00    1180672  models.common.Conv
    162.37       0.00     656896  models.common.SPP
    183.09       0.00    1182720  models.common.C3
     37.09       0.00     131584  models.common.Conv
      5.72       0.00          0  torch.nn.modules.upsampling.Upsample
      9.12       0.00          0  models.common.Concat
    218.21       0.00     361984  models.common.C3
     28.29       0.00      33024  models.common.Conv
      6.74       0.00          0  torch.nn.modules.upsampling.Upsample
     12.23       0.00          0  models.common.Concat
    228.79       0.00      90880  models.common.C3
     50.36       0.00     147712  models.common.Conv
      5.35       0.00          0  models.common.Concat
    209.92       0.00     296448  models.common.C3
     43.30       0.00     590336  models.common.Conv
      0.35       0.00          0  models.common.Concat
    248.59       0.00    1182720  models.common.C3
     57.34       0.00     229245  Detect
3291.7ms total

github-actions[bot] commented 3 years ago

👋 Hello @jaskiratsingh2000, thank you for your interest in YOLOv3 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv3 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Google Colab and Kaggle notebooks with free GPU
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If the CI status badge is green, all YOLOv3 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv3 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

jaskiratsingh2000 commented 3 years ago

@glenn-jocher Can you please let me know about the above? I would highly appreciate your response. Thanks

jaskiratsingh2000 commented 3 years ago

@glenn-jocher Did you get a chance to check on the above? I am still looking forward to steps to produce the profiling for yolov3-tiny. Please let me know. It would be of great help.

Looking forward to hearing from you. Thanks!!

glenn-jocher commented 3 years ago

@jaskiratsingh2000 I don't understand your question. yolo.py accepts any yaml: https://github.com/ultralytics/yolov5/blob/3cb9ad4fc49872cf21ea529277708f1707649cbb/models/yolo.py#L287-L290

jaskiratsingh2000 commented 3 years ago

@glenn-jocher My question is this: as you can see above, I produced profiling for yolov5, that is, the time taken by each layer.

I want to produce the same for yolov3-tiny. How can I do that?

That is exactly what I mean to ask. Can you please let me know, @glenn-jocher?

Thanks!
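
For readers unfamiliar with what a per-layer "time (ms)" column measures, the general technique can be sketched as below. This is a minimal illustration in plain Python, not the code in yolo.py: the `profile_layers` helper, the `repeats` parameter, and the three stand-in "layers" are all hypothetical, and a real model would time each module's forward call instead.

```python
import time


def profile_layers(layers, x, repeats=10):
    """Time each layer on its input; return (rows, final_output).

    layers: list of (name, callable) pairs, applied in order.
    repeats: hypothetical knob; each layer is called this many times
             and the mean time per call is reported in milliseconds.
    """
    rows = []
    for name, layer in layers:
        start = time.perf_counter()
        for _ in range(repeats):
            y = layer(x)
        elapsed_ms = (time.perf_counter() - start) * 1000 / repeats
        rows.append((name, elapsed_ms))
        x = y  # the output of this layer feeds the next one
    return rows, x


# Hypothetical stand-in layers; a real profile would use the model's modules.
layers = [
    ("double", lambda v: [2 * e for e in v]),
    ("shift", lambda v: [e + 1 for e in v]),
    ("total", lambda v: sum(v)),
]

rows, out = profile_layers(layers, list(range(1000)))

print(f"{'time (ms)':>10}  module")
for name, ms in rows:
    print(f"{ms:10.4f}  {name}")
print(f"{sum(ms for _, ms in rows):.4f}ms total")
```

Averaging over several calls smooths out timer noise; the absolute numbers depend on the machine, so only the relative cost of the layers is meaningful.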

glenn-jocher commented 3 years ago

@jaskiratsingh2000 you can profile any yaml you want with yolo.py. See yolo.py argparser for details.
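
As a rough illustration of what "accepts any yaml" means here: the sketch below assumes the `--cfg` argument is declared as quoted later in this thread from the yolov3 repo's models/yolo.py, and shows that the value can be supplied on the command line without editing the default. The `build_parser` wrapper is hypothetical and exists only to keep the example self-contained.

```python
import argparse


def build_parser():
    """Hypothetical wrapper around the --cfg declaration quoted in this thread."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--cfg', type=str, default='yolov3.yaml', help='model.yaml')
    return parser


# No arguments: the default yaml is used.
default_opt = build_parser().parse_args([])

# Passing --cfg overrides the default without touching the source file.
tiny_opt = build_parser().parse_args(['--cfg', 'yolov3-tiny.yaml'])

print(default_opt.cfg)
print(tiny_opt.cfg)
```

Under that assumption, invoking the script with `--cfg yolov3-tiny.yaml` should select the tiny model in the same way as changing `default=` in the source.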

jaskiratsingh2000 commented 3 years ago

Thanks for your reply, @glenn-jocher. I followed what you mentioned above.

In the yolo.py argparser, the line parser.add_argument('--cfg', type=str, default='yolov3.yaml', help='model.yaml') appears at https://github.com/ultralytics/yolov3/blob/ab7ff9dd4c8b8e5a2c282fee93e975887a91ff7b/models/yolo.py#L288

I changed the default value to yolov3-tiny.yaml (default='yolov3-tiny.yaml') and got the following results:

YOLOv3 🚀 v9.5.0-14-g327ecbf torch 1.8.1+cu102 CPU

                 from  n    params  module                                  arguments                     
  0                -1  1       464  models.common.Conv                      [3, 16, 3, 1]                 
  1                -1  1         0  torch.nn.modules.pooling.MaxPool2d      [2, 2, 0]                     
  2                -1  1      4672  models.common.Conv                      [16, 32, 3, 1]                
  3                -1  1         0  torch.nn.modules.pooling.MaxPool2d      [2, 2, 0]                     
  4                -1  1     18560  models.common.Conv                      [32, 64, 3, 1]                
  5                -1  1         0  torch.nn.modules.pooling.MaxPool2d      [2, 2, 0]                     
  6                -1  1     73984  models.common.Conv                      [64, 128, 3, 1]               
  7                -1  1         0  torch.nn.modules.pooling.MaxPool2d      [2, 2, 0]                     
  8                -1  1    295424  models.common.Conv                      [128, 256, 3, 1]              
  9                -1  1         0  torch.nn.modules.pooling.MaxPool2d      [2, 2, 0]                     
 10                -1  1   1180672  models.common.Conv                      [256, 512, 3, 1]              
 11                -1  1         0  torch.nn.modules.padding.ZeroPad2d      [[0, 1, 0, 1]]                
 12                -1  1         0  torch.nn.modules.pooling.MaxPool2d      [2, 1, 0]                     
 13                -1  1   4720640  models.common.Conv                      [512, 1024, 3, 1]             
 14                -1  1    262656  models.common.Conv                      [1024, 256, 1, 1]             
 15                -1  1   1180672  models.common.Conv                      [256, 512, 3, 1]              
 16                -2  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 17                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 18           [-1, 8]  1         0  models.common.Concat                    [1]                           
 19                -1  1    885248  models.common.Conv                      [384, 256, 3, 1]              
 20          [19, 15]  1    196350  Detect                                  [80, [[10, 14, 23, 27, 37, 58], [81, 82, 135, 169, 344, 319]], [256, 512]]
[W NNPACK.cpp:80] Could not initialize NNPACK! Reason: Unsupported hardware.
Model Summary: 59 layers, 8852366 parameters, 8852366 gradients, 13.3 GFLOPS

 time (ms)     GFLOPS     params  module
     10.55       0.10        464  models.common.Conv
     12.16       0.00          0  torch.nn.modules.pooling.MaxPool2d
     10.40       0.24       4672  models.common.Conv
      6.30       0.00          0  torch.nn.modules.pooling.MaxPool2d
      6.19       0.24      18560  models.common.Conv
      3.25       0.00          0  torch.nn.modules.pooling.MaxPool2d
      5.71       0.24      73984  models.common.Conv
      1.63       0.00          0  torch.nn.modules.pooling.MaxPool2d
      6.06       0.24     295424  models.common.Conv
      0.86       0.00          0  torch.nn.modules.pooling.MaxPool2d
      8.49       0.24    1180672  models.common.Conv
      0.14       0.00          0  torch.nn.modules.padding.ZeroPad2d
      1.55       0.00          0  torch.nn.modules.pooling.MaxPool2d
     25.10       0.94    4720640  models.common.Conv
      3.82       0.05     262656  models.common.Conv
      8.57       0.24    1180672  models.common.Conv
      1.36       0.01      33024  models.common.Conv
      0.34       0.00          0  torch.nn.modules.upsampling.Upsample
      0.08       0.00          0  models.common.Concat
     13.58       0.71     885248  models.common.Conv
      3.24       0.08     196350  Detect
129.4ms total

@glenn-jocher Now my question is: is this the profiling result for "yolov3-tiny"? If yes, why does the very first line of the output above, YOLOv3 🚀 v9.5.0-14-g327ecbf torch 1.8.1+cu102 CPU, show yolov3 instead of yolov3-tiny?

Please let me know, @glenn-jocher. Looking forward to hearing your response. Thanks!

jaskiratsingh2000 commented 3 years ago

@glenn-jocher Please let me know if my concern above is clear. I would be glad to hear your response to the question I raised. Thanks!

github-actions[bot] commented 3 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

glenn-jocher commented 11 months ago

@jaskiratsingh2000 The model summary and the profiling results you shared seem to correspond to the YOLOv3-tiny configuration. The label "YOLOv3 🚀 v9.5.0-14-g327ecbf torch 1.8.1+cu102 CPU" at the very first line of the outcome refers to the YOLOv3 framework used for the YOLOv3-tiny model. The naming convention can be a little confusing, but rest assured that the profiling results are indeed for YOLOv3-tiny. If you have any further queries, feel free to ask.
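
One way to cross-check this from the numbers already posted in the thread is to add up the rows of the profiling table and compare them with the reported totals. The sketch below copies the time, params, and module columns from the yolov3-tiny output above (GFLOPS omitted); the variable names are illustrative only.

```python
# time (ms), params, module: copied from the yolov3-tiny profile in this thread
PROFILE = """\
10.55 464 models.common.Conv
12.16 0 torch.nn.modules.pooling.MaxPool2d
10.40 4672 models.common.Conv
6.30 0 torch.nn.modules.pooling.MaxPool2d
6.19 18560 models.common.Conv
3.25 0 torch.nn.modules.pooling.MaxPool2d
5.71 73984 models.common.Conv
1.63 0 torch.nn.modules.pooling.MaxPool2d
6.06 295424 models.common.Conv
0.86 0 torch.nn.modules.pooling.MaxPool2d
8.49 1180672 models.common.Conv
0.14 0 torch.nn.modules.padding.ZeroPad2d
1.55 0 torch.nn.modules.pooling.MaxPool2d
25.10 4720640 models.common.Conv
3.82 262656 models.common.Conv
8.57 1180672 models.common.Conv
1.36 33024 models.common.Conv
0.34 0 torch.nn.modules.upsampling.Upsample
0.08 0 models.common.Concat
13.58 885248 models.common.Conv
3.24 196350 Detect
"""

rows = []
for line in PROFILE.strip().splitlines():
    time_ms, params, module = line.split()
    rows.append((float(time_ms), int(params), module))

total_ms = sum(r[0] for r in rows)
total_params = sum(r[1] for r in rows)

# Index of the row with the largest time; rows are in layer order (0-20).
slowest_index, slowest = max(enumerate(rows), key=lambda pair: pair[1][0])

print(f"{len(rows)} rows, {total_ms:.1f} ms total, {total_params} parameters")
print(f"slowest: layer {slowest_index} ({slowest[2]}, {slowest[0]:.2f} ms)")
```

The 21 rows line up with layers 0 to 20 of the yolov3-tiny table, the times add up to the reported 129.4ms total, and the parameters add up to the 8852366 in the Model Summary line, which is consistent with the profile belonging to that model. The slowest single layer is the 512-to-1024 Conv at index 13.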
