AFallDay opened this issue 4 weeks ago
@AFallDay hey there! Integrating the calculation of Precision (P), Recall (R), Average Precision (AP), and Mean Average Precision (mAP) from `val.py` into `detect.py` involves a bit of work, as these metrics are normally computed during model validation rather than during single-image detection.
Here's a brief rundown of how to start:
1. Import the required functions: import the metric helpers from `val.py`, including the logic that matches predictions to ground truth (true positives, false positives, etc.).
2. Modify `detect.py`: after the model produces detections, use the imported functions to calculate the metrics. Keep in mind that detection and validation differ in how they process data; in particular, validation assumes ground-truth labels are available.
3. Integrate the calculation after detections: for each detection pass in `detect.py`, compute the metrics against the ground truths for your data (you will need to supply these yourself).
Here's a sample code snippet for the `run` function in `detect.py` after modification:
```python
# Assuming the model's predictions and the ground-truth boxes for an image
# are available. some_metric_calculation() is a placeholder -- implement the
# matching yourself (e.g. IoU-based, as process_batch() in val.py does).
true_pos, false_pos, false_neg = some_metric_calculation(pred_boxes, true_boxes)

# Precision and recall at the current confidence threshold
eps = 1e-16  # guard against division by zero when there are no detections
precision = true_pos / (true_pos + false_pos + eps)
recall = true_pos / (true_pos + false_neg + eps)

# Store or print the calculated metrics
print(f'Precision: {precision:.4f}, Recall: {recall:.4f}')
```
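Note that this snippet only yields precision and recall at a single confidence threshold. AP and mAP instead rank every detection in the dataset by confidence and integrate the resulting precision-recall curve, which is what `ap_per_class()` in `utils/metrics.py` does, so you will need to accumulate statistics over all images before calling it.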
For a complete, step-by-step integration, keep your modifications aligned with how detections are processed in your `detect.py`, especially if you are handling batches of data.
Remember, this is non-trivial and requires a good understanding of how detection and validation work in YOLOv5. Good luck! 😊 If you run into specific issues along the way, feel free to raise them with detailed code snippets and any error messages.
@glenn-jocher Can I just import additional labels for ground truths in detect.py to calculate the P, R, AP, and mAP metrics?
Hey there! Yes, you can import additional ground-truth labels in `detect.py` to calculate metrics like Precision (P), Recall (R), Average Precision (AP), and Mean Average Precision (mAP). However, you'll need to ensure these labels are correctly aligned with the model's detections for the metric calculation to be accurate. You'll also need to modify the detection script to compare the ground-truth labels against the predictions and then compute the metrics from that comparison. Keep in mind that this requires careful handling of data formats and of the metric-calculation logic. Good luck! 😊
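For illustration, here is a minimal sketch of what that could look like. It assumes YOLO-format label files (one `class x_center y_center width height` line per object, normalized coordinates) and mirrors the accumulation logic in `val.py`; `load_gt_labels` is a hypothetical helper, while `xywhn2xyxy`, `ap_per_class`, and `process_batch` (defined in `val.py`, shown further below) are existing YOLOv5 utilities:

```python
import numpy as np
import torch

from utils.general import xywhn2xyxy
from utils.metrics import ap_per_class
# process_batch() is defined in val.py; import it or copy it into detect.py


def load_gt_labels(label_path, img_w, img_h):
    # Hypothetical helper: read a non-empty YOLO-format .txt label file and
    # return an (M, 5) tensor of [class, x1, y1, x2, y2] in pixel coordinates.
    labels = np.loadtxt(label_path, ndmin=2).astype(np.float32)  # (M, 5)
    boxes = xywhn2xyxy(labels[:, 1:5], w=img_w, h=img_h)
    return torch.from_numpy(np.concatenate((labels[:, 0:1], boxes), 1))


iouv = torch.linspace(0.5, 0.95, 10)  # IoU thresholds for mAP@0.5:0.95
stats = []

# Inside the detection loop, after NMS, for each image:
#   det: (N, 6) tensor [x1, y1, x2, y2, conf, class] from non_max_suppression()
#   gt:  (M, 5) tensor from load_gt_labels() for the same image
# (move gt and iouv to det.device first if you are running on GPU)
#     correct = process_batch(det, gt, iouv)
#     stats.append((correct, det[:, 4], det[:, 5], gt[:, 0]))

# After all images, mirror val.py to get per-class P, R, AP and the means:
stats_np = [torch.cat(x, 0).cpu().numpy() for x in zip(*stats)]
if len(stats_np) and stats_np[0].any():
    tp, fp, p, r, f1, ap, ap_class = ap_per_class(*stats_np)
    ap50, ap = ap[:, 0], ap.mean(1)  # AP@0.5 and AP@0.5:0.95 per class
    mp, mr, map50, map_ = p.mean(), r.mean(), ap50.mean(), ap.mean()
    print(f'P: {mp:.3f}  R: {mr:.3f}  mAP@0.5: {map50:.3f}  mAP@0.5:0.95: {map_:.3f}')
```

This is a sketch, not a drop-in patch: the per-image lines are commented out because `det` and `gt` only exist inside your detection loop, and you will need to match each image to its label file yourself.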
Search before asking
Question
How can I port the method of calculating P, R, AP, and mAP from val.py into detect.py? Which code needs to be packaged, and where should I add or change it? Please answer this question, thanks!
The code for detect.py is as follows:
```python
import argparse
import csv
import os
import platform
import sys
from pathlib import Path

import torch

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from ultralytics.utils.plotting import Annotator, colors, save_one_box

from models.common import DetectMultiBackend
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements,
                           colorstr, cv2, increment_path, non_max_suppression, print_args, scale_boxes,
                           strip_optimizer, xyxy2xywh)
from utils.torch_utils import select_device, smart_inference_mode


@smart_inference_mode()
def run(
        weights='D:/yolov5/runs/train/exp16/weights/best.pt',  # model path or triton URL
        source='D:/dataset/images/val',  # file/dir/URL/glob/screen/0(webcam)
        data='D:/yolov5/VOC.yaml',  # dataset.yaml path
        imgsz=(640, 640),  # inference size (height, width)
        conf_thres=0.25,  # confidence threshold
        iou_thres=0.45,  # NMS IOU threshold
        max_det=1000,  # maximum detections per image
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        view_img=False,  # show results
        save_txt=False,  # save results to *.txt
        save_csv=False,  # save results in CSV format
        save_conf=False,  # save confidences in --save-txt labels
        save_crop=False,  # save cropped prediction boxes
        nosave=False,  # do not save images/videos
        classes=None,  # filter by class: --class 0, or --class 0 2 3
        agnostic_nms=False,  # class-agnostic NMS
        augment=False,  # augmented inference
        visualize=False,  # visualize features
        update=False,  # update all models
        project=ROOT / 'runs/detect',  # save results to project/name
        name='exp',  # save results to project/name
        exist_ok=False,  # existing project/name ok, do not increment
        line_thickness=3,  # bounding box thickness (pixels)
        hide_labels=False,  # hide labels
        hide_conf=False,  # hide confidences
        half=False,  # use FP16 half-precision inference
        dnn=False,  # use OpenCV DNN for ONNX inference
        vid_stride=1,  # video frame-rate stride
):
    source = str(source)
    save_img = not nosave and not source.endswith('.txt')  # save inference images
    is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
    is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
    webcam = source.isnumeric() or source.endswith('.streams') or (is_url and not is_file)
    screenshot = source.lower().startswith('screen')
    if is_url and is_file:
        source = check_file(source)  # download
    # ... (rest of run() omitted in the paste)


def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str,
                        default='C:/Users/hp/Desktop/AF/guipian/model/exp10/weights/best.pt',
                        help='model path or triton URL')
    # ... (remaining arguments omitted in the paste)


def main(opt):
    check_requirements(ROOT / 'requirements.txt', exclude=('tensorboard', 'thop'))
    run(**vars(opt))


if __name__ == '__main__':
    opt = parse_opt()
    main(opt)
```
The code for val.py is as follows:
```python
import argparse
import json
import os
import subprocess
import sys
from pathlib import Path

import numpy as np
import torch
from tqdm import tqdm

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.common import DetectMultiBackend
from utils.callbacks import Callbacks
from utils.dataloaders import create_dataloader
from utils.general import (LOGGER, TQDM_BAR_FORMAT, Profile, check_dataset, check_img_size, check_requirements,
                           check_yaml, coco80_to_coco91_class, colorstr, increment_path, non_max_suppression,
                           print_args, scale_boxes, xywh2xyxy, xyxy2xywh)
from utils.metrics import ConfusionMatrix, ap_per_class, box_iou
from utils.plots import output_to_target, plot_images, plot_val_study
from utils.torch_utils import select_device, smart_inference_mode
from PIL import Image, ImageDraw


def save_one_txt(predn, save_conf, shape, file):
    # Save one txt result
    ...  # body omitted in the paste


def save_one_json(predn, jdict, path, class_map):
    # Save one JSON result {"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}
    ...  # body omitted in the paste


def process_batch(detections, labels, iouv):
    """
    Return correct prediction matrix
    Arguments:
        detections (array[N, 6]), x1, y1, x2, y2, conf, class
        labels (array[M, 5]), class, x1, y1, x2, y2
    Returns:
        correct (array[N, 10]), for 10 IoU levels
    """
    correct = np.zeros((detections.shape[0], iouv.shape[0])).astype(bool)
    iou = box_iou(labels[:, 1:], detections[:, :4])
    correct_class = labels[:, 0:1] == detections[:, 5]
    for i in range(len(iouv)):
        x = torch.where((iou >= iouv[i]) & correct_class)  # IoU > threshold and classes match
        if x[0].shape[0]:
            matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()  # [label, detect, iou]
            if x[0].shape[0] > 1:
                matches = matches[matches[:, 2].argsort()[::-1]]
                matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
                # matches = matches[matches[:, 2].argsort()[::-1]]
                matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
            correct[matches[:, 1].astype(int), i] = True
    return torch.tensor(correct, dtype=torch.bool, device=iouv.device)


@smart_inference_mode()
def run(
        data='D:/yolov5/VOC.yaml',
        weights='D:/guipian/model/exp16/weights/best.pt',  # model.pt path(s)
        batch_size=1,  # batch size
        imgsz=640,  # inference size (pixels)
        conf_thres=0.001,  # confidence threshold
        iou_thres=0.6,  # NMS IoU threshold
        max_det=300,  # maximum detections per image
        task='val',  # train, val, test, speed or study
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        workers=0,  # max dataloader workers (per RANK in DDP mode)
        single_cls=False,  # treat as single-class dataset
        augment=False,  # augmented inference
        verbose=False,  # verbose output
        save_txt=False,  # save results to *.txt
        save_hybrid=False,  # save label+prediction hybrid results to *.txt
        save_conf=False,  # save confidences in --save-txt labels
        save_json=False,  # save a COCO-JSON results file
        project=ROOT / 'runs/val',  # save to project/name
        name='exp',  # save to project/name
        exist_ok=False,  # existing project/name ok, do not increment
        half=True,  # use FP16 half-precision inference
        dnn=False,  # use OpenCV DNN for ONNX inference
        model=None,
        dataloader=None,
        save_dir=Path(''),
        plots=True,
        callbacks=Callbacks(),
        compute_loss=None,
):
    # Initialize/load model and set device
    ...  # body omitted in the paste


def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--data', type=str, default='D:/yolov5/VOC.yaml', help='dataset.yaml path')
    parser.add_argument('--weights', nargs='+', type=str, default='D:/yolov5/runs/exp_val/exp9/weights/best.pt',
                        help='model path(s)')
    parser.add_argument('--batch-size', type=int, default=1, help='batch size')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.5, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=300, help='maximum detections per image')
    parser.add_argument('--task', default='val', help='train, val, test, speed or study')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
    parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--verbose', default=True, action='store_true', help='report mAP by class')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-hybrid', action='store_true', help='save label+prediction hybrid results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-json', action='store_true', help='save a COCO-JSON results file')
    parser.add_argument('--project', default=ROOT / 'runs/exp_val_map', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    opt = parser.parse_args()
    opt.data = check_yaml(opt.data)  # check YAML
    opt.save_json |= opt.data.endswith('coco.yaml')
    opt.save_txt |= opt.save_hybrid
    print_args(vars(opt))
    return opt


def main(opt):
    check_requirements(ROOT / 'requirements.txt', exclude=('tensorboard', 'thop'))
    # ... (rest of main() omitted in the paste)


if __name__ == '__main__':
    opt = parse_opt()
    main(opt)
```
Additional
No response