Open 11061995 opened 4 months ago
Is there any error log when the program is killed?
Namespace(cmd='quantize', weight='yolov5s.pt', cocodir='datasets/coco', device='cuda:0', ignore_policy='None', ptq='ptq.pt', qat='qat.pt', supervision_stride=1, iters=200, eval_origin=True, eval_ptq=True, all_node_with_qdq=True)
from n params module arguments
0 -1 1 3520 models.common.Conv [3, 32, 6, 2, 2]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 18816 models.common.C3 [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 2 115712 models.common.C3 [128, 128, 2]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 3 625152 models.common.C3 [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 1182720 models.common.C3 [512, 512, 1]
9 -1 1 656896 models.common.SPPF [512, 512, 5]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 361984 models.common.C3 [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 90880 models.common.C3 [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 296448 models.common.C3 [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1182720 models.common.C3 [512, 512, 1, False]
24 [17, 20, 23] 1 229245 models.yolo.Detect [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
YOLOv5s summary: 223 layers, 7235389 parameters, 7235389 gradients, 16.6 GFLOPs
Fusing layers...
YOLOv5s summary: 166 layers, 7225885 parameters, 229245 gradients, 16.4 GFLOPs
Scanning datasets/coco/train2017.cache... 117266 images, 1021 backgrounds, 0 corrupt: 100%|██████████| 118287/118287 00:00
WARNING ⚠️ datasets/coco/images/train2017/000000099844.jpg: 2 duplicate labels removed
WARNING ⚠️ datasets/coco/images/train2017/000000201706.jpg: 1 duplicate labels removed
WARNING ⚠️ datasets/coco/images/train2017/000000214087.jpg: 1 duplicate labels removed
WARNING ⚠️ datasets/coco/images/train2017/000000522365.jpg: 1 duplicate labels removed
Scanning datasets/coco/val2017.cache... 4952 images, 48 backgrounds, 0 corrupt: 100%|██████████| 5000/5000 00:00
Add QuantAdd to model.2.m.0
Add QuantAdd to model.4.m.0
Add QuantAdd to model.4.m.1
Add QuantAdd to model.6.m.0
Add QuantAdd to model.6.m.1
Add QuantAdd to model.6.m.2
Add QuantAdd to model.8.m.0
Collect stats for calibrating: 100%|████████████████████████████████████████████████████████████████| 25/25 [01:29<00:00, 3.58s/it]
Evaluate Origin...
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 500/500 06:09
all 5000 36335 0.664 0.516 0.562 0.372
Evaluating pycocotools mAP... saving _predictions.json...
loading annotations into memory...
Done (t=1.19s)
creating index...
index created!
Loading and preparing results...
Killed
This is all I got.
Whenever I run this command:
python scripts/qat.py quantize yolov5s.pt --ptq=ptq.pt --qat=qat.pt --cocodir=datasets/coco --eval-ptq --eval-origin --all-node-with-qdq
the program gets killed after 5 epochs.
System configuration: CUDA 12.2, Python 3.10, torch 2.3.
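A bare "Killed" with no Python traceback usually means the process received SIGKILL, and on Linux a common cause is the kernel out-of-memory killer. That is only a guess for this run; the log itself does not confirm it. One way to check is to print the process's peak host memory just before the step where the log stops ("Loading and preparing results..."). The sketch below uses only the Python standard library; the helper names `peak_rss_mib` and `log_memory` are hypothetical and not part of qat.py or YOLOv5.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(tag: str) -> float:
    """Print the peak RSS with a label and return it (hypothetical helper)."""
    mib = peak_rss_mib()
    print(f"[mem] {tag}: peak RSS {mib:.1f} MiB", flush=True)
    return mib


if __name__ == "__main__":
    log_memory("start")
    # Stand-in for a memory-heavy step such as loading predictions.
    blob = bytearray(50 * 1024 * 1024)
    log_memory("after allocation")
```

Calling `log_memory(...)` before and after the evaluation step, and comparing the values with the machine's total RAM, would show whether host memory is close to exhausted when the process dies. If it is, the kernel log (for example via `dmesg`) should also contain an out-of-memory entry for the Python process.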