ValueError: x1 must be greater than or equal to x0, when use the val.py to val the onnx model #12473

Closed dengxiongshi closed 9 months ago

dengxiongshi commented 10 months ago

When I use val.py to validate the ONNX model, I get this error:

(NN) D:\python_work\yolov5>python val.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best_train.onnx --device 0 --name train_mode
val: data=E:\downloads\compress\datasets\train_data\train_data.yaml, weights=['runs\\train\\WI_PRW_SSW_SSM_20231127\\weights\\best_train.onnx'], batch_size=16, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=0, workers=0, single_cls=False, augment=False, verbose=False, save_txt=Fal
se, save_hybrid=False, save_conf=False, save_json=False, project=runs\val, name=train_mode, exist_ok=False, half=False, dnn=False
YOLOv5  v7.0-240-g84ec8b5 Python-3.8.18 torch-1.9.1+cu111 CUDA:0 (GeForce RTX 2060, 6144MiB)

Loading runs\train\WI_PRW_SSW_SSM_20231127\weights\best_train.onnx for ONNX Runtime inference...
Forcing --batch-size 1 square inference (1,3,640,640) for non-PyTorch models
val: Scanning E:\downloads\compress\datasets\train_data\labels\val.cache... 2575 images, 0 backgrounds, 0 corrupt: 100%|██████████| 2575/2575 [00:00<?, ?it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95:   0%|          | 1/2575 [00:00<05:14,  8.18it/s]Exception in thread Thread-3:
Traceback (most recent call last):
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 932, in _bootstrap_inner
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\python_work\yolov5\utils\plots.py", line 175, in plot_images
    annotator.box_label(box, label, color=color)
  File "D:\Anaconda3\envs\NN\lib\site-packages\ultralytics\utils\plotting.py", line 108, in box_label
    self.draw.rectangle(box, width=self.lw, outline=color)  # box
  File "D:\Anaconda3\envs\NN\lib\site-packages\PIL\ImageDraw.py", line 294, in rectangle
    self.draw.draw_rectangle(xy, ink, 0, width)
ValueError: x1 must be greater than or equal to x0
                 Class     Images  Instances          P          R      mAP50   mAP50-95:   0%|          | 3/2575 [00:00<03:04, 13.96it/s]Exception in thread Thread-7:
Traceback (most recent call last):
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 932, in _bootstrap_inner
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\python_work\yolov5\utils\plots.py", line 175, in plot_images
    annotator.box_label(box, label, color=color)
  File "D:\Anaconda3\envs\NN\lib\site-packages\ultralytics\utils\plotting.py", line 108, in box_label
    self.draw.rectangle(box, width=self.lw, outline=color)  # box
  File "D:\Anaconda3\envs\NN\lib\site-packages\PIL\ImageDraw.py", line 294, in rectangle
    self.draw.draw_rectangle(xy, ink, 0, width)
ValueError: x1 must be greater than or equal to x0
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 2575/2575 [01:23<00:00, 30.71it/s]
                   all       2575      30443          0          0          0          0
Speed: 0.4ms pre-process, 12.5ms inference, 0.8ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\val\train_mode2


The environment is Python 3.8 on Windows 10. The installed packages are as follows:

dengxiongshi commented 10 months ago

This is the part of export.py that I changed:

def parse_opt(known=False):
    parser = argparse.ArgumentParser()
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model.pt path(s)')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640, 640], help='image (h, w)')
    parser.add_argument('--batch-size', type=int, default=1, help='batch size')
    parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--half', action='store_true', help='FP16 half-precision export')
    parser.add_argument('--inplace', action='store_true', help='set YOLOv5 Detect() inplace=True')
    parser.add_argument('--train', action='store_true', help='model.train() mode')  # added
    parser.add_argument('--keras', action='store_true', help='TF: use Keras')
    parser.add_argument('--optimize', action='store_true', help='TorchScript: optimize for mobile')
    parser.add_argument('--int8', action='store_true', help='CoreML/TF/OpenVINO INT8 quantization')
    parser.add_argument('--dynamic', action='store_true', help='ONNX/TF/TensorRT: dynamic axes')
    parser.add_argument('--simplify', action='store_true', help='ONNX: simplify model')
    parser.add_argument('--opset', type=int, default=10, help='ONNX: opset version')
    parser.add_argument('--verbose', action='store_true', help='TensorRT: verbose log')
    parser.add_argument('--workspace', type=int, default=4, help='TensorRT: workspace size (GB)')
    parser.add_argument('--nms', action='store_true', help='TF: add NMS to model')
    parser.add_argument('--agnostic-nms', action='store_true', help='TF: add agnostic NMS to model')
    parser.add_argument('--topk-per-class', type=int, default=100, help='TF.js NMS: topk per class to keep')
    parser.add_argument('--topk-all', type=int, default=100, help='TF.js NMS: topk for all classes to keep')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='TF.js NMS: IoU threshold')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='TF.js NMS: confidence threshold')
    parser.add_argument(
        '--include',
        nargs='+',
        default=['onnx'],
        help='torchscript, onnx, openvino, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle')
    opt = parser.parse_known_args()[0] if known else parser.parse_args()
    return opt

In run function:

    # Update model
    # model.eval()
    model.train() if train else model.eval()  # changed
    for k, m in model.named_modules():
        if isinstance(m, Detect):
            m.inplace = inplace
            m.dynamic = dynamic
            m.export = True

    for _ in range(2):
        y = model(im)  # dry runs
    if half and not coreml:
        im, model = im.half(), model.half()  # to FP16
    # shape = tuple((y[0] if isinstance(y, tuple) else y).shape)  # model output shape
    shape = tuple(y[0].shape)  # changed
    metadata = {'stride': int(max(model.stride)), 'names': model.names}  # model metadata
    LOGGER.info(f"\n{colorstr('PyTorch:')} starting from {file} with output shape {shape} ({file_size(file):.1f} MB)")

In export_onnx function:

    torch.onnx.export(
        model.cpu() if dynamic else model,  # --dynamic only compatible with cpu
        im.cpu() if dynamic else im,
        # ... (arguments omitted in this excerpt)
        training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
        do_constant_folding=not train,  # WARNING: DNN inference with torch>=1.12 may require do_constant_folding=False
        # ... (arguments omitted in this excerpt)
        dynamic_axes=dynamic or None)

I am using yolov5-7.0 and added the --train option to export.py, changing the code to follow yolov5-6.2. The first ONNX export (with --train) is:

(NN) D:\python_work\yolov5>python export.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.pt --train --simplify --opset 10
export: data=D:\python_work\yolov5\data\coco128.yaml, weights=['runs\\train\\WI_PRW_SSW_SSM_20231127\\weights\\best.pt'], imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, train=True, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=10, verbose=False, workspa
ce=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx']
YOLOv5  v7.0-240-g84ec8b5 Python-3.8.18 torch-1.9.1+cu111 CPU

Fusing layers... 
YOLOv5s_hs summary: 157 layers, 7351674 parameters, 0 gradients, 17.5 GFLOPs

PyTorch: starting from runs\train\WI_PRW_SSW_SSM_20231127\weights\best.pt with output shape (1, 3, 80, 80, 10) (14.3 MB)

ONNX: starting export with onnx 1.12.0...
ONNX: simplifying with onnx-simplifier 0.4.33...
ONNX: export success  2.2s, saved as runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx (28.1 MB)

Export complete (2.6s)
Results saved to D:\python_work\yolov5\runs\train\WI_PRW_SSW_SSM_20231127\weights
Detect:          python detect.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx
Validate:        python val.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx
PyTorch Hub:     model = torch.hub.load('ultralytics/yolov5', 'custom', 'runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx')
Visualize:       https://netron.app

I got three outputs (image).

The second ONNX export (without --train) is:

(NN) D:\python_work\yolov5>python export.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.pt --simplify --opset 10
export: data=D:\python_work\yolov5\data\coco128.yaml, weights=['runs\\train\\WI_PRW_SSW_SSM_20231127\\weights\\best.pt'], imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, train=False, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=10, verbose=False, worksp
ace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx']
YOLOv5  v7.0-240-g84ec8b5 Python-3.8.18 torch-1.9.1+cu111 CPU

Fusing layers...
YOLOv5s_hs summary: 157 layers, 7351674 parameters, 0 gradients, 17.5 GFLOPs

PyTorch: starting from runs\train\WI_PRW_SSW_SSM_20231127\weights\best.pt with output shape (1, 25200, 10) (14.3 MB)

ONNX: starting export with onnx 1.12.0...
ONNX: simplifying with onnx-simplifier 0.4.33...
ONNX: export success  2.3s, saved as runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx (28.5 MB)

Export complete (2.8s)
Results saved to D:\python_work\yolov5\runs\train\WI_PRW_SSW_SSM_20231127\weights
Detect:          python detect.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx
Validate:        python val.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx
PyTorch Hub:     model = torch.hub.load('ultralytics/yolov5', 'custom', 'runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx')
Visualize:       https://netron.app

I got one output (image).

dengxiongshi commented 10 months ago

When I test the accuracy of the ONNX files with val.py, the first ONNX model gives an error:

(NN) D:\python_work\yolov5>python val.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best_train.onnx --device 0 --name train_mode
val: data=E:\downloads\compress\datasets\train_data\train_data.yaml, weights=['runs\\train\\WI_PRW_SSW_SSM_20231127\\weights\\best_train.onnx'], batch_size=16, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=0, workers=0, single_cls=False, augment=False, verbose=False, save_txt=Fal
se, save_hybrid=False, save_conf=False, save_json=False, project=runs\val, name=train_mode, exist_ok=False, half=False, dnn=False
YOLOv5  v7.0-240-g84ec8b5 Python-3.8.18 torch-1.9.1+cu111 CUDA:0 (GeForce RTX 2060, 6144MiB)

Loading runs\train\WI_PRW_SSW_SSM_20231127\weights\best_train.onnx for ONNX Runtime inference...
Forcing --batch-size 1 square inference (1,3,640,640) for non-PyTorch models
val: Scanning E:\downloads\compress\datasets\train_data\labels\val.cache... 2575 images, 0 backgrounds, 0 corrupt: 100%|██████████| 2575/2575 [00:00<?, ?it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95:   0%|          | 1/2575 [00:00<05:14,  8.18it/s]Exception in thread Thread-3:
Traceback (most recent call last):
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 932, in _bootstrap_inner
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\python_work\yolov5\utils\plots.py", line 175, in plot_images
    annotator.box_label(box, label, color=color)
  File "D:\Anaconda3\envs\NN\lib\site-packages\ultralytics\utils\plotting.py", line 108, in box_label
    self.draw.rectangle(box, width=self.lw, outline=color)  # box
  File "D:\Anaconda3\envs\NN\lib\site-packages\PIL\ImageDraw.py", line 294, in rectangle
    self.draw.draw_rectangle(xy, ink, 0, width)
ValueError: x1 must be greater than or equal to x0
                 Class     Images  Instances          P          R      mAP50   mAP50-95:   0%|          | 3/2575 [00:00<03:04, 13.96it/s]Exception in thread Thread-7:
Traceback (most recent call last):
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 932, in _bootstrap_inner
  File "D:\Anaconda3\envs\NN\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\python_work\yolov5\utils\plots.py", line 175, in plot_images
    annotator.box_label(box, label, color=color)
  File "D:\Anaconda3\envs\NN\lib\site-packages\ultralytics\utils\plotting.py", line 108, in box_label
    self.draw.rectangle(box, width=self.lw, outline=color)  # box
  File "D:\Anaconda3\envs\NN\lib\site-packages\PIL\ImageDraw.py", line 294, in rectangle
    self.draw.draw_rectangle(xy, ink, 0, width)
ValueError: x1 must be greater than or equal to x0
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 2575/2575 [01:23<00:00, 30.71it/s]
                   all       2575      30443          0          0          0          0
Speed: 0.4ms pre-process, 12.5ms inference, 0.8ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\val\train_mode2

The second ONNX model validates successfully:

(NN) D:\python_work\yolov5>python val.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx --device 0 --name train_no
val: data=E:\downloads\compress\datasets\train_data\train_data.yaml, weights=['runs\\train\\WI_PRW_SSW_SSM_20231127\\weights\\best.onnx'], batch_size=16, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=0, workers=0, single_cls=False, augment=False, verbose=False, save_txt=False, sa
ve_hybrid=False, save_conf=False, save_json=False, project=runs\val, name=train_no, exist_ok=False, half=False, dnn=False
YOLOv5  v7.0-240-g84ec8b5 Python-3.8.18 torch-1.9.1+cu111 CUDA:0 (GeForce RTX 2060, 6144MiB)

Loading runs\train\WI_PRW_SSW_SSM_20231127\weights\best.onnx for ONNX Runtime inference...
Forcing --batch-size 1 square inference (1,3,640,640) for non-PyTorch models
val: Scanning E:\downloads\compress\datasets\train_data\labels\val.cache... 2575 images, 0 backgrounds, 0 corrupt: 100%|██████████| 2575/2575 [00:00<?, ?it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 2575/2575 [01:32<00:00, 27.76it/s]
                   all       2575      30443      0.807      0.719      0.771       0.51
                  face       2575       6954      0.835      0.687      0.743      0.352
                person       2575      19192      0.814      0.769      0.795      0.471
                   car       2575       4012      0.868      0.833      0.888      0.671
                   bus       2575        187      0.799      0.791      0.835      0.616
                 truck       2575         98      0.717      0.517      0.597      0.439
Speed: 0.4ms pre-process, 12.6ms inference, 1.0ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\val\train_no2

The .pt file also gives the correct result:

(NN) D:\python_work\yolov5>python val.py --weights runs\train\WI_PRW_SSW_SSM_20231127\weights\best.pt --device 0 --name best_pt
val: data=E:\downloads\compress\datasets\train_data\train_data.yaml, weights=['runs\\train\\WI_PRW_SSW_SSM_20231127\\weights\\best.pt'], batch_size=16, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=0, workers=0, single_cls=False, augment=False, verbose=False, save_txt=False, save
_hybrid=False, save_conf=False, save_json=False, project=runs\val, name=best_pt, exist_ok=False, half=False, dnn=False
YOLOv5  v7.0-240-g84ec8b5 Python-3.8.18 torch-1.9.1+cu111 CUDA:0 (GeForce RTX 2060, 6144MiB)

Fusing layers...
YOLOv5s_hs summary: 157 layers, 7351674 parameters, 0 gradients, 17.5 GFLOPs
val: Scanning E:\downloads\compress\datasets\train_data\labels\val.cache... 2575 images, 0 backgrounds, 0 corrupt: 100%|██████████| 2575/2575 [00:00<?, ?it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 161/161 [01:12<00:00,  2.21it/s]
                   all       2575      30443      0.826      0.717      0.774      0.513
                  face       2575       6954      0.836      0.683      0.741      0.352
                person       2575      19192      0.826      0.762      0.796      0.473
                   car       2575       4012      0.869      0.832      0.889      0.678
                   bus       2575        187      0.835      0.783      0.831      0.623
                 truck       2575         98      0.762      0.524      0.614      0.441
Speed: 0.1ms pre-process, 4.2ms inference, 0.7ms NMS per image at shape (16, 3, 640, 640)
Results saved to runs\val\best_pt

I also get the same problem with yolov5-6.2. One more question: how can I get a single output from the three outputs of the first exported ONNX model by reshaping and concatenating them? My .pt file is here. The first ONNX file is best_train.zip, and the second ONNX file is best.zip.

glenn-jocher commented 10 months ago

@dengxiongshi it looks like you encountered an error while trying to validate your ONNX model using val.py. The issue seems to occur with your first ONNX model, while the second ONNX model and the PyTorch (pt) model generated successful results.
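
As a rough illustration of the traceback itself: the ValueError comes from Pillow's rectangle drawing when a box has x1 < x0. That probably means the boxes reaching the plotting code were not valid xyxy coordinates, which would also be consistent with the all-zero metrics. The helper below is a hypothetical guard (`normalize_box` is an illustrative name, not part of YOLOv5 or Pillow) that reorders the corners before drawing. It would only stop the plotting thread from raising; it would not fix the predictions.

```python
def normalize_box(box):
    """Return (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.

    Hypothetical helper for illustration only: it reorders the corners
    of a box so that a drawing routine expecting sorted coordinates
    does not reject it.
    """
    x0, y0, x1, y1 = box
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


# Example: a box whose right edge is left of its left edge
fixed = normalize_box((50, 10, 20, 40))
print(fixed)
```

If boxes like this show up during validation, the model output format is the more likely thing to check, since a correctly decoded prediction should not produce inverted corners.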

Regarding your question about reshaping and concatenating the three outputs in the first exported ONNX file, you might find it helpful to refer to the Ultralytics YOLOv5 documentation for guidance on working with ONNX models and managing model outputs.
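
A minimal sketch of that reshape-and-concatenate step, under some assumptions: the three outputs are the raw Detect head tensors with shape (1, 3, ny, nx, no), ordered by stride 8, 16, 32, and the model uses the default YOLOv5 anchors. A custom model may have different anchors, which would need to be read from the checkpoint instead. `decode_heads`, `ANCHORS`, and `STRIDES` are illustrative names, not YOLOv5 API. The arithmetic mirrors what the Detect layer does in eval mode (sigmoid, grid offset, anchor scaling).

```python
import numpy as np

# Assumption: default YOLOv5 anchors in pixels, one row per stride.
ANCHORS = np.array(
    [[[10, 13], [16, 30], [33, 23]],
     [[30, 61], [62, 45], [59, 119]],
     [[116, 90], [156, 198], [373, 326]]],
    dtype=np.float32,
)
STRIDES = (8, 16, 32)  # assumption: P3, P4, P5 heads


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def decode_heads(outputs, anchors=ANCHORS, strides=STRIDES):
    """Turn three raw (bs, na, ny, nx, no) tensors into one (bs, N, no) array."""
    decoded = []
    for out, anc, stride in zip(outputs, anchors, strides):
        bs, na, ny, nx, no = out.shape
        # Cell grid in (x, y) order, shifted by -0.5 as in the Detect layer
        yv, xv = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
        grid = np.stack((xv, yv), axis=-1).astype(np.float32) - 0.5
        y = sigmoid(out)
        xy = (y[..., 0:2] * 2 + grid) * stride                    # box centre, pixels
        wh = (y[..., 2:4] * 2) ** 2 * anc.reshape(1, na, 1, 1, 2)  # box size, pixels
        merged = np.concatenate((xy, wh, y[..., 4:]), axis=-1)
        decoded.append(merged.reshape(bs, na * ny * nx, no))
    return np.concatenate(decoded, axis=1)


# Example with dummy tensors shaped like the export log (no = 10)
dummy = [np.zeros((1, 3, s, s, 10), dtype=np.float32) for s in (80, 40, 20)]
print(decode_heads(dummy).shape)
```

The result is in the same (1, 25200, 10) xywh layout as the eval-mode export, so it would still need to go through NMS afterwards.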

It's great to see you've successfully obtained results with the second ONNX model and the PyTorch model! If you need further assistance in troubleshooting the issue with the first ONNX model, feel free to provide additional details, and the community will be happy to help.

