Open sleipnier opened 1 year ago
Can you provide the val.py script and the vitis ai version from the docker image?
The commands I ran with val.py are given below:
python val.py --data ./data/coco.yaml --weights ./float/yolov5n_float.pt --batch-size 16 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode calib --nndct_quant
python val.py --data ./data/coco.yaml --weights ./float/yolov5n_float.pt --batch-size 1 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode test --nndct_quant
python val.py --data ./data/coco.yaml --weights ./float/yolov5n_float.pt --batch-size 1 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode test --nndct_quant --dump_xmodel
And the Vitis AI version from the docker image is: xilinx/vitis-ai-pytorch-cpu:ubuntu2004-3.0.0.106
It would be very kind of you to help me solve this problem. If you have any questions, do not hesitate to contact me.
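For context, the three commands above map onto the vai_q_pytorch torch_quantizer flow roughly as follows. This is only a minimal sketch condensed from the val.py posted further down in this thread; the model and dataloader variables are placeholders, not part of the original script.

import torch
from pytorch_nndct.apis import torch_quantizer

def quantize(model, dataloader, quant_mode, output_dir='nndct_quant', dump_xmodel=False):
    # Trace input; the shape must match --imgsz (640 here).
    dummy_input = torch.randn(1, 3, 640, 640)
    quantizer = torch_quantizer(quant_mode=quant_mode,   # 'calib' or 'test'
                                module=model,
                                input_args=dummy_input,
                                output_dir=output_dir)
    quant_model = quantizer.quant_model
    # Push data through the quantized wrapper; dumping the xmodel only needs one batch of size 1.
    for img, *_ in dataloader:
        quant_model(img.float() / 255.0)
        if dump_xmodel:
            break
    if quant_mode == 'calib':
        quantizer.export_quant_config()                  # writes quant_info.json
    elif dump_xmodel:
        quantizer.export_xmodel(output_dir=output_dir, deploy_check=False)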
I meant val.py as in the script, the Python file itself, not the CLI command.
Oh, sorry, I misunderstood your question. The val.py file is given below:
# Copyright 2019 Xilinx Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Validate a trained YOLOv5 model accuracy on a custom dataset
Usage:
$ python path/to/val.py --data coco128.yaml --weights yolov5s.pt --img 640
"""
import argparse
import json
import os
import sys
from pathlib import Path
from threading import Thread
from functools import partial
import numpy as np
import torch
from tqdm import tqdm
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.experimental import attempt_load
from utils.datasets import create_dataloader
from utils.general import coco80_to_coco91_class, check_dataset, check_img_size, check_requirements, \
check_suffix, check_yaml, box_iou, non_max_suppression, scale_coords, xyxy2xywh, xywh2xyxy, set_logging, \
increment_path, colorstr, print_args
from utils.metrics import ap_per_class, ConfusionMatrix
from utils.plots import output_to_target, plot_images, plot_val_study
from utils.torch_utils import select_device, time_sync
from utils.callbacks import Callbacks
def save_one_txt(predn, save_conf, shape, file):
# Save one txt result
gn = torch.tensor(shape)[[1, 0, 1, 0]] # normalization gain whwh
for *xyxy, conf, cls in predn.tolist():
xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
line = (cls, *xywh, conf) if save_conf else (cls, *xywh) # label format
with open(file, 'a') as f:
f.write(('%g ' * len(line)).rstrip() % line + '\n')
def save_one_json(predn, jdict, path, class_map):
# Save one JSON result {"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}
image_id = int(path.stem) if path.stem.isnumeric() else path.stem
box = xyxy2xywh(predn[:, :4]) # xywh
box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner
for p, b in zip(predn.tolist(), box.tolist()):
jdict.append({'image_id': image_id,
'category_id': class_map[int(p[5])],
'bbox': [round(x, 3) for x in b],
'score': round(p[4], 5)})
def process_batch(detections, labels, iouv):
"""
Return correct predictions matrix. Both sets of boxes are in (x1, y1, x2, y2) format.
Arguments:
detections (Array[N, 6]), x1, y1, x2, y2, conf, class
labels (Array[M, 5]), class, x1, y1, x2, y2
Returns:
correct (Array[N, 10]), for 10 IoU levels
"""
correct = torch.zeros(detections.shape[0], iouv.shape[0], dtype=torch.bool, device=iouv.device)
iou = box_iou(labels[:, 1:], detections[:, :4])
x = torch.where((iou >= iouv[0]) & (labels[:, 0:1] == detections[:, 5])) # IoU above threshold and classes match
if x[0].shape[0]:
matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy() # [label, detection, iou]
if x[0].shape[0] > 1:
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
# matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
matches = torch.Tensor(matches).to(iouv.device)
correct[matches[:, 1].long()] = matches[:, 2:3] >= iouv
return correct
def run(data,
weights=None, # model.pt path(s)
batch_size=32, # batch size
imgsz=640, # inference size (pixels)
conf_thres=0.001, # confidence threshold
iou_thres=0.6, # NMS IoU threshold
task='val', # train, val, test, speed or study
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
single_cls=False, # treat as single-class dataset
augment=False, # augmented inference
verbose=False, # verbose output
save_txt=False, # save results to *.txt
save_hybrid=False, # save label+prediction hybrid results to *.txt
save_conf=False, # save confidences in --save-txt labels
save_json=False, # save a COCO-JSON results file
project=ROOT / 'runs/val', # save to project/name
name='exp', # save to project/name
exist_ok=False, # existing project/name ok, do not increment
half=True, # use FP16 half-precision inference
nndct_quant=False,
nndct_bitwidth=8,
model=None,
dataloader=None,
save_dir=Path(''),
plots=True,
callbacks=Callbacks(),
compute_loss=None,
quant_mode='calib',
dump_xmodel=False,
nndct_stat=0,
):
# Initialize/load model and set device
training = model is not None
if nndct_quant:
os.environ["W_QUANT"] = "1"
assert half is False and augment is False, "Invalid settings for nndct quant"
if dump_xmodel:
assert quant_mode == 'test', "Quant model should be 'test' for dumping xmodel"
assert batch_size == 1, "Dump xmodel only support batch size 1"
if training and not nndct_quant: # called by train.py
device = next(model.parameters()).device # get model device
else: # called directly
device = select_device(device, batch_size=batch_size)
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
(save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir
# Load model
if training:
device = next(model.parameters()).device # get model device
else:
check_suffix(weights, '.pt')
model = attempt_load(weights, map_location=device, fuse=not nndct_quant, force_reexport_deployable_model=not training, imgsz=imgsz) # load FP32 model
# Data
data = check_dataset(data) # check
gs = max(int(model.stride.max()), 32) # grid size (max stride)
imgsz = check_img_size(imgsz, s=gs) # check image size
# Multi-GPU disabled, incompatible with .half() https://github.com/ultralytics/yolov5/issues/99
# if device.type != 'cpu' and torch.cuda.device_count() > 1:
# model = nn.DataParallel(model)
# Half
half &= device.type != 'cpu' # half precision only supported on CUDA
model.half() if half else model.float()
# Configure
model.eval()
is_coco = isinstance(data.get('val'), str) and data['val'].endswith('coco/val2017.txt') # COCO dataset
nc = 1 if single_cls else int(data['nc']) # number of classes
iouv = torch.linspace(0.5, 0.95, 10).to(device) # iou vector for mAP@0.5:0.95
niou = iouv.numel()
# Dataloader
if not training:
if device.type != 'cpu':
model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
pad = 0.0 if task == 'speed' else 0.5
task = task if task in ('train', 'val', 'test') else 'val' # path to train/val/test images
dataloader = create_dataloader(data[task], imgsz, batch_size, gs, single_cls, pad=pad, rect=not nndct_quant,
prefix=colorstr(f'{task}: '), workers=8)[0]
seen = 0
confusion_matrix = ConfusionMatrix(nc=nc)
names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
class_map = coco80_to_coco91_class() if is_coco else list(range(1000))
s = ('%20s' + '%11s' * 6) % ('Class', 'Images', 'Labels', 'P', 'R', 'mAP@.5', 'mAP@.5:.95')
dt, p, r, f1, mp, mr, map50, map = [0.0, 0.0, 0.0], 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
loss = torch.zeros(3, device=device)
jdict, stats, ap, ap_class = [], [], [], []
if nndct_quant:
from pytorch_nndct.apis import torch_quantizer
import pytorch_nndct as py_nndct
from nndct_shared.utils import NndctOption
from nndct_shared.base import key_names, NNDCT_KEYS, NNDCT_DEBUG_LVL, GLOBAL_MAP, NNDCT_OP
import nndct_shared.quantization as nndct_quant
from pytorch_nndct.quantization import torchquantizer
input_tensor = (torch.randn(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))
model.forward = partial(model.forward, quant=True)
if training:
assert quant_mode == 'test'
output_dir = weights
else:
w = Path(weights[0] if isinstance(weights, list) else weights)
output_dir = w.parent / 'nndct_quant'
print(f"NNDCT quant dir: {output_dir}")
quantizer = torch_quantizer(quant_mode=quant_mode,
bitwidth=nndct_bitwidth,
module=model,
input_args=input_tensor,
output_dir=output_dir.as_posix() )
# if (NndctOption.nndct_stat.value > 2):
# def do_quantize(instance, blob, name, node=None, tensor_type='input'):
# # forward quant graph but not quantize parameter and activation
# if NndctOption.nndct_quant_off.value:
# return blob
# blob_save = None
# if isinstance(blob.values, torch.Tensor):
# blob_save = blob
# blob = blob.values.data
# quant_device = GLOBAL_MAP.get_ele(NNDCT_KEYS.QUANT_DEVICE)
# if blob.device.type != quant_device.type:
# raise TypeError(
# "Device of quantizer is {}, device of model and data should match device of quantizer".format(
# quant_device.type))
# if (NndctOption.nndct_stat.value > 2):
# quant_data = nndct_quant.QuantizeData(name, blob.cpu().detach().numpy())
# # quantize the tensor
# bnfp = instance.get_bnfp(name, True, tensor_type)
# if (NndctOption.nndct_stat.value > 1):
# print('---- quant %s tensor: %s with 1/step = %g' % (
# tensor_type, name, bnfp[1]))
# # hardware cut method
# mth = 4 if instance.lstm else 2
# if tensor_type == 'param':
# mth = 3
# res = py_nndct.nn.NndctFixNeuron(blob,
# blob,
# maxamp=[bnfp[0], bnfp[1]],
# method=mth)
# if (NndctOption.nndct_stat.value > 2):
# quant_efficiency, sqnr = quant_data.quant_efficiency(blob.cpu().detach().numpy(), 8)
# torchquantizer.global_snr_inv += 1 / sqnr
# print(f"quant_efficiency={quant_efficiency}, global_snr_inv={torchquantizer.global_snr_inv} {quant_data._name}\n")
# # update param to nndct graph
# if tensor_type == 'param':
# instance.update_param_to_nndct(node, name, res.cpu().detach().numpy())
# if blob_save is not None:
# blob_save.values.data = blob
# blob = blob_save
# res = blob_save
# return res
# _quantizer = GLOBAL_MAP.get_ele(NNDCT_KEYS.QUANTIZER)
# _quantizer.do_quantize = do_quantize.__get__(_quantizer)
quant_model = quantizer.quant_model
ori_forward = quant_model.forward
post_method = model.model[-1].post_process
def forward(*args, **kwargs):
out = ori_forward(*args, **kwargs)
return post_method(out)
quant_model.forward = forward
if dump_xmodel:
total = 1
else:
total = len(dataloader)
for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s, total=total)):
t1 = time_sync()
img = img.to(device, non_blocking=True)
img = img.half() if half else img.float() # uint8 to fp16/32
img /= 255.0 # 0 - 255 to 0.0 - 1.0
targets = targets.to(device)
nb, _, height, width = img.shape # batch size, channels, height, width
t2 = time_sync()
dt[0] += t2 - t1
with torch.no_grad():
# Run model
if nndct_quant:
out = quant_model(img)
out, train_out = out[0], out[1]
else:
out, train_out = model(img, augment=augment) # inference and training outputs
dt[1] += time_sync() - t2
# Compute loss
if compute_loss:
loss += compute_loss([x.float() for x in train_out], targets)[1] # box, obj, cls
# Run NMS
targets[:, 2:] *= torch.Tensor([width, height, width, height]).to(device) # to pixels
lb = [targets[targets[:, 0] == i, 1:] for i in range(nb)] if save_hybrid else [] # for autolabelling
t3 = time_sync()
out = non_max_suppression(out, conf_thres, iou_thres, labels=lb, multi_label=True, agnostic=single_cls)
dt[2] += time_sync() - t3
# Statistics per image
for si, pred in enumerate(out):
labels = targets[targets[:, 0] == si, 1:]
nl = len(labels)
tcls = labels[:, 0].tolist() if nl else [] # target class
path, shape = Path(paths[si]), shapes[si][0]
seen += 1
if len(pred) == 0:
if nl:
stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls))
continue
# Predictions
if single_cls:
pred[:, 5] = 0
predn = pred.clone()
scale_coords(img[si].shape[1:], predn[:, :4], shape, shapes[si][1]) # native-space pred
# Evaluate
if nl:
tbox = xywh2xyxy(labels[:, 1:5]) # target boxes
scale_coords(img[si].shape[1:], tbox, shape, shapes[si][1]) # native-space labels
labelsn = torch.cat((labels[:, 0:1], tbox), 1) # native-space labels
correct = process_batch(predn, labelsn, iouv)
if plots:
confusion_matrix.process_batch(predn, labelsn)
else:
correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool)
stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls)) # (correct, conf, pcls, tcls)
# Save/log
if save_txt:
save_one_txt(predn, save_conf, shape, file=save_dir / 'labels' / (path.stem + '.txt'))
if save_json:
save_one_json(predn, jdict, path, class_map) # append to COCO-JSON dictionary
callbacks.run('on_val_image_end', pred, predn, path, names, img[si])
# Plot images
if plots and batch_i < 3:
f = save_dir / f'val_batch{batch_i}_labels.jpg' # labels
Thread(target=plot_images, args=(img, targets, paths, f, names), daemon=True).start()
f = save_dir / f'val_batch{batch_i}_pred.jpg' # predictions
Thread(target=plot_images, args=(img, output_to_target(out), paths, f, names), daemon=True).start()
if dump_xmodel:
break
if nndct_quant and quant_mode == 'calib':
quantizer.export_quant_config()
if dump_xmodel:
quantizer.export_xmodel(output_dir=output_dir.as_posix(), deploy_check=False)
return
# Compute statistics
stats = [np.concatenate(x, 0) for x in zip(*stats)] # to numpy
if len(stats) and stats[0].any():
p, r, ap, f1, ap_class = ap_per_class(*stats, plot=plots, save_dir=save_dir, names=names)
ap50, ap = ap[:, 0], ap.mean(1) # AP@0.5, AP@0.5:0.95
mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean()
nt = np.bincount(stats[3].astype(np.int64), minlength=nc) # number of targets per class
else:
nt = torch.zeros(1)
# Print results
pf = '%20s' + '%11i' * 2 + '%11.3g' * 4 # print format
print(pf % ('all', seen, nt.sum(), mp, mr, map50, map))
# Print results per class
if (verbose or (nc < 50 and not training)) and nc > 1 and len(stats):
for i, c in enumerate(ap_class):
print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i]))
# Print speeds
t = tuple(x / seen * 1E3 for x in dt) # speeds per image
if not training:
shape = (batch_size, 3, imgsz, imgsz)
print(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {shape}' % t)
# Plots
if plots:
confusion_matrix.plot(save_dir=save_dir, names=list(names.values()))
callbacks.run('on_val_end')
# Save JSON
if save_json and len(jdict):
w = Path(weights[0] if isinstance(weights, list) else weights).stem if weights is not None else '' # weights
anno_json = str('/group/dphi_algo/coco/annotations/annotations_2017/instances_val2017.json') # annotations json
pred_json = str(save_dir / f"{w}_predictions.json") # predictions json
print(f'\nEvaluating pycocotools mAP... saving {pred_json}...')
with open(pred_json, 'w') as f:
json.dump(jdict, f)
try: # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
check_requirements(['pycocotools'])
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
anno = COCO(anno_json) # init annotations api
pred = anno.loadRes(pred_json) # init predictions api
eval = COCOeval(anno, pred, 'bbox')
if is_coco:
eval.params.imgIds = [int(Path(x).stem) for x in dataloader.dataset.img_files] # image IDs to evaluate
eval.evaluate()
eval.accumulate()
eval.summarize()
map, map50 = eval.stats[:2] # update results (mAP@0.5:0.95, mAP@0.5)
except Exception as e:
print(f'pycocotools unable to run: {e}')
# Return results
model.float() # for training
if not training:
s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
print(f"Results saved to {colorstr('bold', save_dir)}{s}")
maps = np.zeros(nc) + map
for i, c in enumerate(ap_class):
maps[c] = ap[i]
return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model.pt path(s)')
parser.add_argument('--batch-size', type=int, default=32, help='batch size')
parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--conf-thres', type=float, default=0.001, help='confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.6, help='NMS IoU threshold')
parser.add_argument('--task', default='val', help='train, val, test, speed or study')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--verbose', action='store_true', help='report mAP by class')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-hybrid', action='store_true', help='save label+prediction hybrid results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--save-json', action='store_true', help='save a COCO-JSON results file')
parser.add_argument('--project', default=ROOT / 'runs/val', help='save to project/name')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--quant_mode', default='calib', help='nndct quant mode')
parser.add_argument('--nndct_quant', action='store_true', help='use nndct quant model for inference')
parser.add_argument('--dump_xmodel', action='store_true', help='dump nndct xmodel')
parser.add_argument('--nndct_stat', type=int, required=False, default=0)
opt = parser.parse_args()
opt.data = check_yaml(opt.data) # check YAML
opt.save_json |= opt.data.endswith('coco.yaml')
opt.save_txt |= opt.save_hybrid
print_args(FILE.stem, opt)
return opt
def main(opt):
set_logging()
check_requirements(exclude=('tensorboard', 'thop'))
if opt.task in ('train', 'val', 'test'): # run normally
run(**vars(opt))
elif opt.task == 'speed': # speed benchmarks
# python val.py --task speed --data coco.yaml --batch 1 --weights yolov5n.pt yolov5s.pt...
for w in opt.weights if isinstance(opt.weights, list) else [opt.weights]:
run(opt.data, weights=w, batch_size=opt.batch_size, imgsz=opt.imgsz, conf_thres=.25, iou_thres=.45,
device=opt.device, save_json=False, plots=False)
elif opt.task == 'study': # run over a range of settings and save/plot
# python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n.pt yolov5s.pt...
x = list(range(256, 1536 + 128, 128)) # x axis (image sizes)
for w in opt.weights if isinstance(opt.weights, list) else [opt.weights]:
f = f'study_{Path(opt.data).stem}_{Path(w).stem}.txt' # filename to save to
y = [] # y axis
for i in x: # img-size
print(f'\nRunning {f} point {i}...')
r, _, t = run(opt.data, weights=w, batch_size=opt.batch_size, imgsz=i, conf_thres=opt.conf_thres,
iou_thres=opt.iou_thres, device=opt.device, save_json=opt.save_json, plots=False)
y.append(r + t) # results and times
np.savetxt(f, y, fmt='%10.4g') # save
os.system('zip -r study.zip study_*.txt')
plot_val_study(x=x) # plot
if __name__ == "__main__":
opt = parse_opt()
main(opt)
You have to remove some operations from the model head, i.e. exclude certain nodes from the quantized graph; refer to this issue: https://github.com/ultralytics/yolov5/issues/1288
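Concretely, the posted val.py already handles this by keeping the head's decode (sigmoid, grid/anchor math) out of the quantized graph: it forwards the model with quant=True so only the raw convolution outputs are traced, then re-attaches the float post_process around the quantized model. A condensed sketch of those lines (model and dummy_input are assumed to be already built):

from functools import partial
from pytorch_nndct.apis import torch_quantizer

model.forward = partial(model.forward, quant=True)         # head returns raw conv maps only
quantizer = torch_quantizer(quant_mode='test', module=model,
                            input_args=dummy_input, output_dir='nndct_quant')
quant_model = quantizer.quant_model

post_process = model.model[-1].post_process                # float decode kept outside the xmodel
ori_forward = quant_model.forward

def forward(*args, **kwargs):
    out = ori_forward(*args, **kwargs)                     # quantized, DPU-friendly part
    return post_process(out)                               # sigmoid/grid decode in float

quant_model.forward = forward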
Quantized YOLOv5 model: mAP@0.5 is high (around 0.47), but the detection results are unreasonable and unexpected
Recently I have tried to quantize yolov5_nano using the coco and coco128 datasets. The mAP@0.5 and mAP@0.5:0.95 after calib and test are both normal and satisfactory. However, the image results of these two runs are not. A more detailed description of the steps follows.
calib
I ran the command below to perform calibration on the yolov5_nano model using the coco dataset.
python val.py --data ./data/coco.yaml --weights ./float/yolov5n_float.pt --batch-size 16 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode calib --nndct_quant
The result printed in the terminal is:
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[VAIQ_NOTE]: Loading NNDCT kernels...
val: data=./data/coco.yaml, weights=['./float/yolov5n_float.pt'], batch_size=16, imgsz=640, conf_thres=0.5, iou_thres=0.65, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=False, quant_mode=calib, nndct_quant=True, dump_xmodel=False, nndct_stat=0
YOLOv5 🚀 v3.5-13-gf74ddc6ed torch 1.12.1 CPU

                 from  n    params  module                                  arguments
  0                -1  1      1760  models.common.Conv                      [3, 16, 6, 2, 2]
  1                -1  1      4672  models.common.Conv                      [16, 32, 3, 2]
  2                -1  1      4800  models.common.C3                        [32, 32, 1]
  3                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  4                -1  2     29184  models.common.C3                        [64, 64, 2]
  5                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  6                -1  3    156928  models.common.C3                        [128, 128, 3]
  7                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  8                -1  1    296448  models.common.C3                        [256, 256, 1]
  9                -1  1    164608  models.common.SPPF                      [256, 256, 5]
 10                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1     90880  models.common.C3                        [256, 128, 1, False]
 14                -1  1      8320  models.common.Conv                      [128, 64, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     22912  models.common.C3                        [128, 64, 1, False]
 18                -1  1     36992  models.common.Conv                      [64, 64, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1     74496  models.common.C3                        [128, 128, 1, False]
 21                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 24      [17, 20, 23]  1    115005  models.yolo.Detect                      [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484747659/work/aten/src/ATen/native/TensorShape.cpp:2894.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Model Summary: 298 layers, 1872157 parameters, 1872157 gradients, 4.6 GFLOPs
val: Scanning '../datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupted: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
NNDCT quant dir: float/nndct_quant
[VAIQ_WARN][QUANTIZER_TORCH_CUDA_UNAVAILABLE]: CUDA (HIP) is not available, change device to CPU
[VAIQ_NOTE]: OS and CPU information:
               system --- Linux
                 node --- ubuntu
              release --- 5.15.0-82-generic
              version --- #91~20.04.1-Ubuntu SMP Fri Aug 18 16:24:39 UTC 2023
              machine --- x86_64
            processor --- x86_64
[VAIQ_NOTE]: Tools version information:
                  GCC --- GCC 9.4.0
               python --- 3.7.12
              pytorch --- 1.12.1
        vai_q_pytorch --- 3.0.0+a44284e+torch1.12.1
[VAIQ_WARN][QUANTIZER_TORCH_CUDA_UNAVAILABLE]: CUDA (HIP) is not available, change device to CPU.
[VAIQ_NOTE]: Quant config file is empty, use default quant configuration
[VAIQ_NOTE]: Quantization calibration process start up...
[VAIQ_NOTE]: =>Quant Module is in 'cpu'.
[VAIQ_NOTE]: =>Parsing Model...
[VAIQ_NOTE]: Start to trace and freeze model...
[VAIQ_NOTE]: The input model Model is torch.nn.Module.
[VAIQ_NOTE]: Finish tracing.
[VAIQ_NOTE]: Processing ops...
██████████| 205/205 [00:00<00:00, 1314.63it/s, OpInfo: name = return_0, type = Return]
[VAIQ_NOTE]: =>Doing weights equalization...
[VAIQ_NOTE]: =>Quantizable module is generated.(float/nndct_quant/Model.py)
[VAIQ_NOTE]: =>Get module with quantization.
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 313/313 [3:07:21<00:00, 35.92s/it]
[VAIQ_NOTE]: =>Exporting quant config.(float/nndct_quant/quant_info.json)
                 all       5000      36335      0.815      0.156      0.486      0.333
Speed: 2.6ms pre-process, 2241.2ms inference, 1.4ms NMS per image at shape (16, 3, 640, 640)

Evaluating pycocotools mAP... saving runs/val/exp62/yolov5n_float_predictions.json...
/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/pkg_resources/__init__.py:119: PkgResourcesDeprecationWarning: 3.0.0-a44284e-torch1.12.1 is an invalid version and will not be supported in a future release
  PkgResourcesDeprecationWarning,
requirements: pycocotools not found and is required by YOLOv5, attempting auto-update...
requirements: 'pip install pycocotools' skipped (offline)
pycocotools unable to run: No module named 'pycocotools'
Results saved to runs/val/exp62
Judging by the results in the terminal, everything still looks relatively normal.
test
I also ran the command below to run the test step on the yolov5_nano model using the coco dataset.
python val.py --data ./data/coco.yaml --weights ./float/yolov5n_float.pt --batch-size 1 --imgsz 640 --conf-thres 0.5 --iou-thres 0.65 --quant_mode test --nndct_quant
The result printed in the terminal is:
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[VAIQ_NOTE]: Loading NNDCT kernels...
val: data=./data/coco.yaml, weights=['./float/yolov5n_float.pt'], batch_size=1, imgsz=640, conf_thres=0.5, iou_thres=0.65, task=val, device=, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=False, quant_mode=test, nndct_quant=True, dump_xmodel=False, nndct_stat=0
YOLOv5 🚀 v3.5-13-gf74ddc6ed torch 1.12.1 CPU

                 from  n    params  module                                  arguments
  0                -1  1      1760  models.common.Conv                      [3, 16, 6, 2, 2]
  1                -1  1      4672  models.common.Conv                      [16, 32, 3, 2]
  2                -1  1      4800  models.common.C3                        [32, 32, 1]
  3                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  4                -1  2     29184  models.common.C3                        [64, 64, 2]
  5                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  6                -1  3    156928  models.common.C3                        [128, 128, 3]
  7                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  8                -1  1    296448  models.common.C3                        [256, 256, 1]
  9                -1  1    164608  models.common.SPPF                      [256, 256, 5]
 10                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1     90880  models.common.C3                        [256, 128, 1, False]
 14                -1  1      8320  models.common.Conv                      [128, 64, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     22912  models.common.C3                        [128, 64, 1, False]
 18                -1  1     36992  models.common.Conv                      [64, 64, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1     74496  models.common.C3                        [128, 128, 1, False]
 21                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 24      [17, 20, 23]  1    115005  models.yolo.Detect                      [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484747659/work/aten/src/ATen/native/TensorShape.cpp:2894.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Model Summary: 298 layers, 1872157 parameters, 1872157 gradients, 4.6 GFLOPs
val: Scanning '../datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupted: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
NNDCT quant dir: float/nndct_quant
[VAIQ_WARN][QUANTIZER_TORCH_CUDA_UNAVAILABLE]: CUDA (HIP) is not available, change device to CPU
[VAIQ_NOTE]: OS and CPU information:
               system --- Linux
                 node --- ubuntu
              release --- 5.15.0-82-generic
              version --- #91~20.04.1-Ubuntu SMP Fri Aug 18 16:24:39 UTC 2023
              machine --- x86_64
            processor --- x86_64
[VAIQ_NOTE]: Tools version information:
                  GCC --- GCC 9.4.0
               python --- 3.7.12
              pytorch --- 1.12.1
        vai_q_pytorch --- 3.0.0+a44284e+torch1.12.1
[VAIQ_WARN][QUANTIZER_TORCH_CUDA_UNAVAILABLE]: CUDA (HIP) is not available, change device to CPU.
[VAIQ_NOTE]: Quant config file is empty, use default quant configuration
[VAIQ_NOTE]: Quantization test process start up...
[VAIQ_NOTE]: =>Quant Module is in 'cpu'.
[VAIQ_NOTE]: =>Parsing Model...
[VAIQ_NOTE]: Start to trace and freeze model...
[VAIQ_NOTE]: The input model Model is torch.nn.Module.
[VAIQ_NOTE]: Finish tracing.
[VAIQ_NOTE]: Processing ops...
██████████| 205/205 [00:00<00:00, 1068.12it/s, OpInfo: name = return_0, type = Return]
[VAIQ_NOTE]: =>Doing weights equalization...
[VAIQ_NOTE]: =>Quantizable module is generated.(float/nndct_quant/Model.py)
[VAIQ_NOTE]: =>Get module with quantization.
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 5000/5000 [45:36<00:00, 1.83it/s]
                 all       5000      36335      0.794      0.157      0.476      0.319
Speed: 2.2ms pre-process, 522.4ms inference, 1.2ms NMS per image at shape (1, 3, 640, 640)

Evaluating pycocotools mAP... saving runs/val/exp65/yolov5n_float_predictions.json...
It is still normal.
unexpected results
However, as soon as I opened the folder containing the images saved after running test and calib, I was shocked. For example, even a vase, which should be very easy to detect (Figure 1 below), is not detected in the results after quantization. Even the classic bear pictures in the COCO dataset are not detected, which makes me doubt the correctness of the quantization results.
So I'm wondering why mAP@0.5 is high while the actual detection results are not reasonable or usable. Is there any way to solve this problem?
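For reference, a rough visual check I can run (a sketch reusing the dataloader and quant_model built in val.py; the 0.25 threshold is only for visualization, lower than the 0.5 used for the mAP runs above):

import cv2
import torch
from utils.general import non_max_suppression, scale_coords

img, targets, paths, shapes = next(iter(dataloader))   # one batch from the val dataloader
img = img.float() / 255.0
with torch.no_grad():
    out = quant_model(img)[0]                          # decoded predictions from the wrapped forward
pred = non_max_suppression(out, conf_thres=0.25, iou_thres=0.45)[0]

orig = cv2.imread(paths[0])
scale_coords(img[0].shape[1:], pred[:, :4], shapes[0][0], shapes[0][1])  # back to original image space
for *xyxy, conf, cls in pred.tolist():
    cv2.rectangle(orig, (int(xyxy[0]), int(xyxy[1])), (int(xyxy[2]), int(xyxy[3])), (0, 255, 0), 2)
cv2.imwrite('quant_check.jpg', orig)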
Hello, can you please tell me what version of yolov5 you are using?
I am using yolov5-6.0 for all the training and quantization.
After deploying the model on the FPGA, I noticed that objects are detected but the boxes are very small. I think the problem is with the prototxt file I'm using. Can you share the prototxt you are using?
Where can I get this val.py?
Hello, does anyone have any idea where I can get this val.py script? Thanks.