Closed Henry0528 closed 3 years ago
👋 Hello @Henry0528, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom training ❔ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
@Henry0528 oh interesting! A table with the quantitative results would be cool.
Yes, as you've seen, GFLOPs only loosely correlates with speed. The correlation will vary by platform, backend, drivers, etc.
@glenn-jocher Thank you for your comment. I've made a table of my results. The test speed (inference only, not including NMS time) on GPU is almost the same, while testing on CPU shows some difference. I guess the ghost module may not be well supported on GPU. I also found that NMS time differs between CPU and GPU, and doing NMS on GPU is even more time-consuming.
@Henry0528 hmm, your updates seem to work well!
The params are halved, which would help reduce package sizes for iOS/Android apps etc without losing any speed or accuracy. The FLOPS are halved also, though the speed impact doesn't quite reflect that same reduction. CPU shows a ~10% improvement.
I think the next step would be to test on larger models, i.e. probably by training a YOLOv5l6 model on COCO for 300 epochs to verify the improvements translate to larger models. If you want to share your yaml I could get a run started on one of our machines.
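For readers wondering what such a yaml looks like: below is a hypothetical backbone fragment in YOLOv5's standard `[from, number, module, args]` yaml format, showing how `Conv`/`C3` entries could be swapped for Ghost variants. The module names `GhostConv` and `C3Ghost` are assumptions here (the thread's actual change replaced the Bottleneck inside C3 with a GhostBottleneck), and this is not the file attached in models.zip.

```yaml
# Hypothetical fragment only -- the real yolov5s-ghost.yaml may differ.
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],           # 0-P1/2
   [-1, 1, GhostConv, [128, 3, 2]],   # 1-P2/4  (Conv -> GhostConv)
   [-1, 3, C3Ghost, [128]],           # 2       (C3 with GhostBottleneck)
   [-1, 1, GhostConv, [256, 3, 2]],   # 3-P3/8
  ]
```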
@glenn-jocher I'm a college student doing my graduation project, which uses YOLOv5 for detection in underwater sonar images. I'm new to YOLO, so I'm not sure whether my change is reasonable. I would be grateful if you could run some further tests on my work. models.zip
@Henry0528 thanks, I'll take a look!
@Henry0528 I get the following stats comparison below for YOLOv5s (baseline) and your changes (YOLOv5s-ghost). These numbers don't line up with your comment above https://github.com/ultralytics/yolov5/issues/3234#issuecomment-844614891 though. Can you double check the model you sent and your table values?
```
python models/yolo.py --cfg yolov5s.yaml
Model Summary: 283 layers, 7276605 parameters, 7276605 gradients, 17.1 GFLOPS

python models/yolo.py --cfg yolov5s-ghost.yaml
Model Summary: 323 layers, 5870101 parameters, 5870101 gradients, 14.1 GFLOPS
```
@glenn-jocher you should also change common.py like this and you may get the same result as me
@Henry0528 thanks! After your change my numbers line up with your table. Ok, I'll try to train YOLOv5s and YOLOv5m versions of the model to compare to the baselines.
```
python models/yolo.py --cfg yolov5s-ghost.yaml
Model Summary: 479 layers, 3706861 parameters, 3706861 gradients, 8.2 GFLOPS
```
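For intuition on where the reduction comes from, here is a rough, standalone parameter-count sketch of a Ghost convolution, assuming the layout used in the ghost models (a primary k×k conv producing half the output channels, plus a 5×5 depthwise "cheap" operation generating the rest). Bias and BatchNorm parameters are ignored, so counts are approximate.

```python
# Rough parameter-count sketch for a Ghost convolution. Assumption: the
# primary conv produces c2//2 channels and a 5x5 depthwise conv generates
# the other half. BatchNorm/bias params are ignored, so counts are approximate.

def conv_params(c1, c2, k):
    """Weights of a standard convolution: c1 * c2 * k * k."""
    return c1 * c2 * k * k

def ghost_conv_params(c1, c2, k):
    """Primary conv to c2//2 channels + 5x5 depthwise 'cheap' operation."""
    c_ = c2 // 2
    primary = c1 * c_ * k * k  # standard conv, half the output channels
    cheap = c_ * 5 * 5         # depthwise: one 5x5 filter per channel
    return primary + cheap

# Stage 7 of YOLOv5s: Conv(256, 512, 3) vs a GhostConv(256, 512, 3)
print(conv_params(256, 512, 3))        # ~1.18M
print(ghost_conv_params(256, 512, 3))  # ~0.60M, roughly half
```

These land close to the per-layer params printed by yolo.py for stage 7 (1180672 for the Conv version vs 597248 for the GhostConv version elsewhere in this thread); the small gap is the BatchNorm parameters this sketch ignores.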
I have tried the ghost module just as you did on my custom dataset, but the mAP drops a lot. Any ideas to discuss?
@15050188022 Could you leave a contact method so we can discuss? QQ 1570053804
@15050188022 @Henry0528 I started some YOLOv5s Ghost trainings in this public W&B project: https://wandb.ai/glenn-jocher/ghost
I've got two ghost trainings, and I'll add the baseline YOLOv5s model soon.
@glenn-jocher I've seen the training results; maybe the drop in mAP is acceptable!
@Henry0528 @15050188022 our Ghost study is finished in https://wandb.ai/glenn-jocher/ghost. I trained the default YOLOv5s (master branch) with yolov5s.yaml, and yolov5s-ghost.yaml with the L134 change mentioned in https://github.com/ultralytics/yolov5/issues/3234#issuecomment-846032730, in the ghost branch. Training results are below. So the accuracy is a bit less, though the params and FLOPs are reduced by almost half, so it seems to be a worthwhile compromise for some situations, like mobile app deployments that require small package sizes. Training memory and speed were about the same between all 3.
@glenn-jocher could you also check the test time of each model? Though the FLOPs are halved, the time cost shows little difference on my own dataset.
@Henry0528 @15050188022 YOLOv5s-ghost comparison here using
```
python test.py --weights yolov5s-ghost2.pt --data coco.yaml --img 640 --iou 0.65
YOLOv5 🚀 v5.0-116-gbb13123 torch 1.8.1+cu101 CUDA:0 (Tesla T4, 15109.75MB)
```
| Model | size (pixels) | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>test</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 | Speed T4 (ms) | params (M) | FLOPS 640 (B) |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 640 | 37.0 | - | 56.4 | 4.7 | 7.3 | 17.0 |
| YOLOv5s-ghost1 | 640 | 35.2 | - | 54.0 | 4.8 | 5.1 | 11.2 |
| YOLOv5s-ghost2 | 640 | 35.6 | - | 54.1 | 4.9 | 3.9 | 8.8 |
@glenn-jocher the results are strange. With lower params and FLOPs, the time cost is longer.
@Henry0528 no, it's not strange. What we see in the results is mainly the effect of depthwise separable convolutions, popularized by the EfficientNet/EfficientDet architectures, which show the same lower-params/FLOPs but slower-inference trend when compared to fully-convolutional (no groups) YOLO and ResNet architectures.
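One way to see why halved FLOPs need not halve latency: depthwise layers move nearly as much data as full convolutions while doing far fewer multiply-adds, so they tend to be memory-bound. Below is a back-of-the-envelope arithmetic-intensity sketch with illustrative numbers, not measurements; real kernels fuse operations and cache activations, so treat the exact figures loosely.

```python
# Back-of-the-envelope arithmetic intensity (FLOPs per byte moved) for a
# full 3x3 conv vs. a 3x3 depthwise conv on an 80x80 feature map with 256
# channels. Illustrative only: real kernels fuse ops and reuse cached data.

def intensity(flops, bytes_moved):
    return flops / bytes_moved

H = W = 80
c = 256
k = 3
elem = 4  # float32 bytes

# Full convolution: every output channel reads every input channel.
full_flops = 2 * H * W * c * c * k * k
full_bytes = (H * W * c + H * W * c + c * c * k * k) * elem  # in + out + weights

# Depthwise convolution: one k x k filter per channel.
dw_flops = 2 * H * W * c * k * k
dw_bytes = (H * W * c + H * W * c + c * k * k) * elem

print(f"full conv: {intensity(full_flops, full_bytes):.0f} FLOPs/byte")
print(f"depthwise: {intensity(dw_flops, dw_bytes):.1f} FLOPs/byte")
```

The depthwise layer's intensity is orders of magnitude lower, so its runtime is set by memory bandwidth rather than by its (much smaller) FLOP count, which is consistent with the similar wall-clock times in the table above.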
I think the results are interesting though, like I said everyone has different priorities, so for organizations prioritizing smaller packages these results may appeal.
@glenn-jocher thank you very much for your answer. Now my problem is solved.
@Henry0528 yeah no problem! Of course keep in mind that the results will vary by backend, so CPU inference, ONNX, android, iOS inference, etc. may see different levels of improvement than CUDA.
@Henry0528 I was thinking about this some more. Most of the model parameters are in the P5 and P6 layers at the largest strides.
Perhaps we could employ ghost modules only in the largest output layer, thereby achieving the most size reduction while minimally impacting the accuracy.
The ghost modules applied to P3/8 modules in particular are achieving next to no size reduction since those parameters are already so few. For example we could apply ghost modules to stages 7, 8, 9 in the backbone and 23 in the head (last output layer). This might reduce our model size by maybe 20-30% with (hopefully) little to no accuracy/speed hits.
```
                 from  n    params  module                                  arguments
  0                -1  1      3520  models.common.Focus                     [3, 32, 3]
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  2                -1  1     18816  models.common.C3                        [64, 64, 1]
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  4                -1  1    156928  models.common.C3                        [128, 128, 3]
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  6                -1  1    625152  models.common.C3                        [256, 256, 3]
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]
  8                -1  1    656896  models.common.SPP                       [512, 512, [5, 9, 13]]
  9                -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 24      [17, 20, 23]  1    229245  Detect                                  [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 283 layers, 7276605 parameters, 7276605 gradients, 17.1 GFLOPs
```
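As a quick sanity check on the 20-30% estimate, we can sum the printed params of stages 7, 8, 9 and 23 from the summary above and assume ghost variants roughly halve them (an assumption based on the earlier yolov5s vs yolov5s-ghost comparison in this thread, not a measured result):

```python
# Per-stage params copied from the YOLOv5s model summary; the "halving"
# factor is an assumption, not a measurement.
total = 7_276_605
heavy = {7: 1_180_672, 8: 656_896, 9: 1_182_720, 23: 1_182_720}

heavy_sum = sum(heavy.values())  # params in the four largest stages
saved = heavy_sum // 2           # assume ghost roughly halves these stages
reduction = saved / total

print(f"heavy stages: {heavy_sum} params ({heavy_sum / total:.0%} of the model)")
print(f"estimated reduction: {reduction:.0%}")
```

These four stages hold well over half of the model's parameters, and halving them alone would shrink the model by roughly the 20-30% suggested above.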
@glenn-jocher How can I access the YAML file for the YOLOv5-ghost? I was not able to find that on this repo. Please let me know Thanks!
@jaskiratsingh2000 yamls are in https://github.com/ultralytics/yolov5/issues/3234#issuecomment-845140740 zip
Okay, I got it. I just tried running yolov5s-ghost, and as you mentioned in a previous comment it is one of the smallest YOLO versions you have. I checked it on an RPi and it showed a total time of 3986.6 ms, which is one of the highest among yolov5 and yolov5-tiny. So if it is the smallest version, how is it taking more time? @glenn-jocher
@jaskiratsingh2000 profiling results are indicated in https://github.com/ultralytics/yolov5/issues/3234#issuecomment-850312297
@glenn-jocher even there it shows more time taken than normal yolov5 and yolov5-tiny. Can you check once?
@jaskiratsingh2000 I created the metrics in https://github.com/ultralytics/yolov5/issues/3234#issuecomment-850312297, I don't need to check anything, they are correct
@glenn-jocher I am not making this up; these are the results I get whenever I try to run yolov5s-ghost.yaml:
```
YOLOv5 🚀 v5.0-110-gae04192 torch 1.7.0a0+e85d494 CPU

                 from  n    params  module                                  arguments
  0                -1  1      3520  models.common.Focus                     [3, 32, 3]
  1                -1  1     10144  models.experimental.GhostConv           [32, 64, 3, 2]
  2                -1  1      9656  models.common.C3                        [64, 64, 1]
  3                -1  1     38720  models.experimental.GhostConv           [64, 128, 3, 2]
  4                -1  1     43600  models.common.C3                        [128, 128, 3]
  5                -1  1    151168  models.experimental.GhostConv           [128, 256, 3, 2]
  6                -1  1    165024  models.common.C3                        [256, 256, 3]
  7                -1  1    597248  models.experimental.GhostConv           [256, 512, 3, 2]
  8                -1  1    656896  models.common.SPP                       [512, 512, [5, 9, 13]]
  9                -1  1    564672  models.common.C3                        [512, 512, 1, False]
 10                -1  1     69248  models.experimental.GhostConv           [512, 256, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1    208608  models.common.C3                        [512, 256, 1, False]
 14                -1  1     18240  models.experimental.GhostConv           [256, 128, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     53104  models.common.C3                        [256, 128, 1, False]
 18                -1  1     75584  models.experimental.GhostConv           [128, 128, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1    143072  models.common.C3                        [256, 256, 1, False]
 21                -1  1    298624  models.experimental.GhostConv           [256, 256, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1    564672  models.common.C3                        [512, 512, 1, False]
 24      [17, 20, 23]  1     35061  Detect                                  [8, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 479 layers, 3706861 parameters, 3706861 gradients

 time (ms)     GFLOPS      params  module
    118.30       0.00        3520  models.common.Focus
    250.34       0.00       10144  models.experimental.GhostConv
    274.37       0.00        9656  models.common.C3
    279.72       0.00       38720  models.experimental.GhostConv
    820.88       0.00       43600  models.common.C3
    314.01       0.00      151168  models.experimental.GhostConv
   1421.55       0.00      165024  models.common.C3
    338.42       0.00      597248  models.experimental.GhostConv
    129.29       0.00      656896  models.common.SPP
    708.59       0.00      564672  models.common.C3
    304.88       0.00       69248  models.experimental.GhostConv
      1.11       0.00           0  torch.nn.modules.upsampling.Upsample
      3.00       0.00           0  models.common.Concat
    532.37       0.00      208608  models.common.C3
    283.08       0.00       18240  models.experimental.GhostConv
      1.33       0.00           0  torch.nn.modules.upsampling.Upsample
      3.89       0.00           0  models.common.Concat
    484.57       0.00       53104  models.common.C3
    285.69       0.00       75584  models.experimental.GhostConv
      0.66       0.00           0  models.common.Concat
    619.82       0.00      143072  models.common.C3
    303.81       0.00      298624  models.experimental.GhostConv
      0.59       0.00           0  models.common.Concat
    716.95       0.00      564672  models.common.C3
     36.44       0.00       35061  Detect
8233.7ms total
```
That is why I am saying it repeatedly. @glenn-jocher please take a look, or let me know if I am just missing something.

If yolov5s-ghost is the smallest version, then it should take the least time of all of these, but here it is taking more. @glenn-jocher

I have also changed the line in common.py that was mentioned above, but the results are still the same for me. @glenn-jocher please let me know about this; I am kind of stuck on it now.
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
Good news 😃! Your original issue may now be fixed ✅ in PR #4412. This PR adds a new yolov5s-ghost.yaml file to data/hub models to allow anyone to get started training Ghost models. You can start training this with:

```
python train.py --cfg yolov5s-ghost.yaml --weights yolov5s.pt
```
To receive this update:

- `git pull` from within your `yolov5/` directory, or `git clone https://github.com/ultralytics/yolov5` again
- `model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)`
- `sudo docker pull ultralytics/yolov5:latest` to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
❔ Question

I've recently changed the YOLOv5s model with ghost modules. I used GhostConv to replace Conv, and replaced the Bottleneck in C3 with GhostBottleneck. I trained the new model successfully and got only a very small drop in mAP, while the new model costs about 8 GFLOPs, which is half of the YOLOv5s model (16 GFLOPs). But when I run test.py, I find that the test time is almost the same as the YOLOv5s model, sometimes even longer.
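A closing note on measuring this: single-run timings can mislead, because the first calls pay one-off costs (memory allocation, kernel selection). Below is a minimal, framework-free timing-harness sketch; the lambda is a placeholder workload standing in for model inference, not YOLOv5 code.

```python
import time

def benchmark(fn, warmup=3, iters=10):
    """Time a callable: run warmup iterations first (discarded), then
    return the mean of `iters` timed runs in milliseconds."""
    for _ in range(warmup):
        fn()
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - t0) / iters * 1e3

# Placeholder workload (hypothetical) standing in for a model forward pass.
ms = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"{ms:.2f} ms per call")
```

The same warmup-then-average pattern (plus a device synchronize when timing CUDA) is how per-model latencies like the T4 numbers earlier in this thread are typically produced.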