ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
Exported to ONNX model gives chaotic predictions #1603

Closed PowercoderJr closed 3 years ago

PowercoderJr commented 3 years ago

❔Question

I exported a custom-trained yolov5x.pt model to .onnx by running

python models/export.py --weights best.pt --img 640 --batch 1

as described in #251. The .pt model gives fine predictions, but the .onnx model gives random boxes, most of which aren't even in the [0, 1) range before NMS. Also, the .pt model's raw output before NMS has shape (1, 25200, 23), whereas the .onnx model's has shape (1, 3, 80, 80, 23), which is (1, 19200, 23) after reshaping. Could this be because the model was trained with batch size 64 and exported with --batch 1? Or is there a mistake in my script?
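The two shapes quoted above are consistent with each other if the model uses the usual YOLOv5 head. The sketch below assumes three detection scales at strides 8, 16 and 32 with 3 anchors per grid cell; those defaults are an assumption about this custom model, not something stated in the thread, and yolo_output_rows is a hypothetical helper name.

```python
def yolo_output_rows(img_size=640, strides=(8, 16, 32), anchors_per_scale=3):
    """Count prediction rows per detection scale for a square input.

    Assumes the stock YOLOv5 layout: one grid per stride, with
    `anchors_per_scale` predictions in every grid cell.
    """
    per_scale = [anchors_per_scale * (img_size // s) ** 2 for s in strides]
    return per_scale, sum(per_scale)


per_scale, total = yolo_output_rows()
print(per_scale, total)  # [19200, 4800, 1200] 25200
```

Under that assumption, (1, 3, 80, 80, 23) is only the stride-8 output, so reshaping it alone covers 19200 of the 25200 rows that the .pt model returns. The last dimension, 23, would then be 4 box values + 1 objectness score + 18 class scores.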

Additional context

import cv2
import numpy as np
import onnx
from onnx_tf.backend import prepare
import torch

from utils.datasets import letterbox
from utils.general import non_max_suppression, scale_coords

def main():
    classes = [...]
    image_size = 640
    image_bgr = cv2.imread('image.jpg')
    image = letterbox(image_bgr, new_shape=(image_size, image_size))[0]
    image = cv2.copyMakeBorder(image, 0, image_size - image.shape[0], 0, image_size - image.shape[1],
        borderType=cv2.BORDER_CONSTANT)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).transpose(2, 0, 1)
    image = image / 255.0
    image = np.expand_dims(image, 0)
    image = np.ascontiguousarray(image)

    onnx_model = onnx.load("./best.onnx")
    out = prepare(onnx_model).run(image.astype(np.float32)).output
    s = out.shape
    out = out.reshape((s[0], s[1] * s[2] * s[3], s[4]))
    result = non_max_suppression(torch.tensor(out), conf_thres=0.25, iou_thres=0.45,
        classes=None, agnostic=False, labels=())[0]
    scaled = scale_coords((1, 1), result[:, :4], image_bgr.shape[1::-1]).round().numpy()
    print(scaled)
    image_boxed = image_bgr.copy()
    for i in range(len(result)):
        if result[i][4] > 0.75:
            x0, y0, x1, y1 = scaled[i]
            cv2.rectangle(image_boxed, (x0, y0), (x1, y1), (0, 0, 255), 2)
    cv2.imshow('boxes', image_boxed)
    cv2.waitKey()

if __name__ == '__main__':
    main()

github-actions[bot] commented 3 years ago

Hello @PowercoderJr, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt


glenn-jocher commented 3 years ago

@PowercoderJr the batch sizes are unrelated (not important). The exported ONNX models are lacking the box reconstruction steps in the Detect() layer. You can try to set this line to False: https://github.com/ultralytics/yolov5/blob/68e6ab668b30a6014215b94e399151f8c76e471a/models/export.py#L50
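For readers who keep export=True and post-process the raw outputs themselves, the box reconstruction mentioned above can be approximated in NumPy. This is a minimal sketch modelled on the Detect() inference code of that period, not the exported graph itself; decode_scale is a hypothetical helper name, and P3_ANCHORS holds the stock YOLOv5 stride-8 anchors, which may differ from the anchors of a custom-trained model.

```python
import numpy as np

# Assumption: stock YOLOv5 anchors (in pixels) for the stride-8 (P3) scale.
P3_ANCHORS = [[10, 13], [16, 30], [33, 23]]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def decode_scale(raw, anchors, stride):
    """Turn one raw head output into xywh boxes in network-input pixels.

    raw:     array of shape (batch, n_anchors, ny, nx, n_outputs)
    anchors: (n_anchors, 2) anchor widths/heights in pixels for this scale
    stride:  downsampling factor of this scale (8, 16 or 32)
    """
    bs, na, ny, nx, no = raw.shape
    y = sigmoid(raw)

    # Grid of cell offsets; the last axis is (x, y) to match the box layout.
    yv, xv = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    grid = np.stack((xv, yv), axis=-1).reshape(1, 1, ny, nx, 2)
    anchor_grid = np.asarray(anchors, dtype=np.float32).reshape(1, na, 1, 1, 2)

    y[..., 0:2] = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride  # box centre xy
    y[..., 2:4] = (y[..., 2:4] * 2.0) ** 2 * anchor_grid     # box size wh
    return y.reshape(bs, -1, no)


# Dummy stride-8 output for a 640x640 input with 23 values per prediction.
raw = np.zeros((1, 3, 80, 80, 23), dtype=np.float32)
decoded = decode_scale(raw, P3_ANCHORS, stride=8)
print(decoded.shape)  # (1, 19200, 23)
```

Each scale would be decoded with its own anchors and stride, and the results concatenated along axis 1 before non_max_suppression. The decoded boxes are in pixels of the network input, not in the [0, 1) range.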

Also if you are interested in TF .pb export you may also want to see PR https://github.com/ultralytics/yolov5/pull/1127, which performs this directly from a TF2 Keras version of YOLOv5.

PowercoderJr commented 3 years ago

@glenn-jocher it's looking much better with False in that line, thank you! The boxes sit a little too high, but I think I just need to re-read my code carefully. They're definitely not chaotic now.

PowercoderJr commented 3 years ago

@glenn-jocher I've managed to run the model with TF2 by converting .pt -> .onnx -> saved_model, but do you have any idea why TF1 reports tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'onnx_tf_prefix_Sigmoid_1244': Unknown input node 'onnx_tf_prefix_Transpose_1243'? When the model was exported with model.model[-1].export = True it loaded with no errors, although it gave bad predictions. The .pb file was converted with onnx-tf convert -i best.onnx -o best.pb, as described in https://github.com/onnx/onnx-tensorflow/tree/tf-1.x

import tensorflow as tf  # TF 1.x API (tf.gfile, tf.GraphDef, tf.Session)

graph = tf.Graph()
with graph.as_default():
    with tf.gfile.GFile('./best.pb', 'rb') as f:
        config = tf.ConfigProto()
        config.gpu_options.per_process_gpu_memory_fraction = 0.3
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def)
    with tf.Session(graph=graph, config=config) as sess:
        out = sess.run(['import/output:0'], feed_dict={'import/images:0': image})[0]

(the exception comes from line tf.import_graph_def(graph_def))

glenn-jocher commented 3 years ago

@PowercoderJr no, I have no experience with this export pathway.

PowercoderJr commented 3 years ago

Okay, thanks then

Hisan-007 commented 2 years ago

I can't seem to find the line model.model[-1].export = True # set Detect() layer export=True in export.py. Could you please help?