ultralytics / ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

onnx export image size issue #10370

Closed pranavteli closed 3 months ago

pranavteli commented 4 months ago

Search before asking

Question

I have trained a YOLOv8s segmentation model with an imgsz of 1280. When I export the model to ONNX with the default imgsz (640) it performs well, but when I export with the image size I trained on (1280) it gives poor predictions. Also, the segmentation output size is 4 times smaller than imgsz, which results in lower-resolution masks. Is there any way to tackle this issue?

Additionally, I want to run segmentation model inference in C# (.NET). I could only find ONNX Runtime implementations for the segmentation task, and even the official Ultralytics docs suggest converting the model to ONNX. Is there any way to use PyTorch models directly in C#? I see a PyTorch implementation, i.e. TorchSharp, but didn't find any example for computer vision models.

Additional

No response

github-actions[bot] commented 4 months ago

👋 Hello @pranavteli, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

If the Ultralytics CI badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 4 months ago

Hello! 🙌

It sounds like you're experiencing a couple of issues related to exporting and using the YOLOv8 segmentation model. Let's address each one:

  1. ONNX Export with Different Image Sizes: When exporting your model with an image size different from what it was trained on, it's important to ensure the input size during inference matches the export size. Discrepancies in size can lead to poor predictions due to scaling differences. If you've trained your model at a resolution of 1280, try exporting it with the same image size:

    yolo export model=yolov8s-seg.pt format=onnx imgsz=1280,1280
  2. Segmentation Output Size: The lower resolution of the segmentation masks is expected behavior due to downsampling within the network architecture. To get higher-resolution masks, you might consider post-processing steps like upsampling the masks to your desired resolution (see the sketch after this list).

  3. Using PyTorch Models in C# (.NET): As for integrating the segmentation model with C#, while the most straightforward approach is indeed to export the model to ONNX and use it with ONNX Runtime, direct usage of PyTorch models in C# is less common and less documented. TorchSharp is a promising avenue, but as you've noticed, examples for computer vision models are scarce. It might require setting up the model architecture manually in C# using TorchSharp and then loading the weights from your trained PyTorch model. This approach, however, can be quite complex, and documentation specific to TorchSharp would be your best reference.
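
For point 2, here's a rough post-processing sketch of upsampling the prototype masks returned by the ONNX model (the array names and shapes are illustrative, not a fixed API):

import cv2
import numpy as np

def upsample_masks(protos, target_hw):
    # Bilinearly upsample prototype masks of shape (num_masks, h, w) to (num_masks, H, W).
    # This is plain post-processing on the ONNX outputs; the model itself is untouched.
    h, w = target_hw
    return np.stack([cv2.resize(m, (w, h), interpolation=cv2.INTER_LINEAR) for m in protos])

# e.g. output1 of a 640x640 seg export is typically (1, 32, 160, 160)
protos = np.random.rand(32, 160, 160).astype(np.float32)  # dummy stand-in for output1[0]
full_res = upsample_masks(protos, (640, 640))
print(full_res.shape)  # (32, 640, 640)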

I hope this helps clarify things a bit! If you're tackling the TorchSharp route, it might be worthwhile to reach out in their community forums or documentation for more specific guidance. Good luck with your project! 👍

pranavteli commented 4 months ago

Thank you for taking the time to answer this. However, the issue is that I trained the model with imgsz=1280. When I export it with the default size, it performs well:

yolo export model=yolov8s-seg.pt format=onnx

But when I export it at the size I trained on, it performs worse:

yolo export model=yolov8s-seg.pt format=onnx imgsz=1280,1280

Another interesting fact is that it performs particularly worse for the class whose objects are large. Somehow, it splits objects of that class into multiple instances. This issue only appears for the large imgsz.

Also, is there any way to modify the ONNX export architecture itself to add upsampling for the segmentation map, so that it has the same size as the input imgsz? Can you please guide me in this case?

glenn-jocher commented 4 months ago

Hey there! 👋

Thanks for your follow-up and the detailed explanation. It seems like the performance dip when exporting with the larger image size might relate to how scaling affects the model's perception of object sizes. Here are a couple of insights that might help:

  1. Performance with Large Image Size: The issue you've observed for larger objects getting segmented into multiple instances could be due to how the network's receptive fields interpret spatial relationships at different scales. This is a bit nuanced but not entirely unexpected.

  2. Modifying ONNX Export for Upsampling: Directly modifying the ONNX export to include upsampling is a bit complex since it would involve altering the model's architecture before export. Instead, consider adding an upsampling step as post-processing after inference. If the direct modification is critical, you might need to adjust the PyTorch model architecture to include an upsampling layer towards the end, ensuring the output size matches the input, and then export to ONNX. Here's a basic snippet on including an upsampling layer in PyTorch:

    import torch
    import torch.nn as nn
    from ultralytics import YOLO
    
    # Example model
    model = YOLO('yolov8s-seg.pt').model
    
    # Assuming 'model' is your loaded/separate segmentation network
    # Add an upsampling layer
    upsampling_layer = nn.Upsample(scale_factor=4, mode='bilinear', align_corners=True)
    model.add_module('upsample', upsampling_layer)
    
    # Then proceed with the export
    model.export('yolov8s-seg-upsampled.onnx', imgsz=(1280, 1280))

Given the complexity of integrating these changes, testing is key. Tweaks might be needed depending on the exact architecture of your model.

For the large-objects issue, taking a step back to look at the model's training dynamics at different resolutions could provide further insights. Sometimes even minute architectural or hyperparameter adjustments can have a significant impact.

Hope these pointers help you a bit further down the road! Cheers to your project's success. 🚀

pranavteli commented 4 months ago

Thanks Glenn, I tried the export and I'm getting the following error.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-2-afa50277e2d2>](https://localhost:8080/#) in <cell line: 14>()
     12 
     13 # Then proceed with the export
---> 14 model.export('yolov8s-seg-upsampled.onnx', imgsz=(1280, 1280))

[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in __getattr__(self, name)
   1686             if name in modules:
   1687                 return modules[name]
-> 1688         raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
   1689 
   1690     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'SegmentationModel' object has no attribute 'export'

I worked around the error by modifying the given code snippet.

import torch
import torch.nn as nn
from ultralytics import YOLO

model = YOLO('yolov8n-seg.pt').model

upsampling_layer = nn.Upsample(scale_factor=4, mode='bilinear', align_corners=True)
model.add_module('upsample', upsampling_layer)
output_names = ["output0", "output1"]

im = torch.zeros(1, 3, 1280, 1280)
torch.onnx.export(
    model,  # dynamic=True only compatible with cpu
    im,
    'yolov8n-seg-upsampled.onnx',
    verbose=False,
    opset_version=17,
    do_constant_folding=True,
    input_names=["images"],
    output_names=output_names,
    dynamic_axes=None,
)

But when I visualize the results now, the outputs look weird (see the attached image).

I even tried saving the .pt first and then using the ultralytics export.

import torch
import torch.nn as nn
from ultralytics import YOLO
model = YOLO('yolov8s-seg.pt').model
upsampling_layer = nn.Upsample(scale_factor=4, mode='bilinear', align_corners=True)
model.add_module('upsample', upsampling_layer)
torch.save(model.state_dict(), f='yolov8s-seg-upsampled.pt')

!yolo export model=/content/yolov8s-seg-upsampled.pt format=onnx

It seems it's not saving the model in a compatible format; I'm getting the following error.

Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/cfg/__init__.py", line 551, in entrypoint
    model = SAM(model)
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/models/sam/model.py", line 47, in __init__
    super().__init__(model=model, task="segment")
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/engine/model.py", line 151, in __init__
    self._load(model, task=task)
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/models/sam/model.py", line 57, in _load
    self.model = build_sam(weights)
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/models/sam/build.py", line 159, in build_sam
    raise FileNotFoundError(f"{ckpt} is not a supported SAM model. Available models are: \n {sam_model_map.keys()}")
FileNotFoundError: /content/yolov8s-seg-upsampled.pt is not a supported SAM model. Available models are: 
 dict_keys(['sam_h.pt', 'sam_l.pt', 'sam_b.pt', 'mobile_sam.pt'])

glenn-jocher commented 4 months ago

Hey there! 👋

It seems like you've run into a bit of a hiccup with the model export step. When you encountered the AttributeError for 'SegmentationModel' object has no attribute export, your workaround to use torch.onnx.export directly is spot-on! 🎯

Concerning the weird outputs after adding the upsampling layer and exporting to ONNX, this can sometimes happen due to layer compatibility issues or how the operations are defined/interpreted. Without seeing the exact output, it might be hard to diagnose, but ensuring that the upsampling_layer aligns with the rest of your model architecture in terms of input/output dimensions could be a starting point.

As for the error when exporting your .pt model with the yolo CLI: the traceback shows the CLI routing your file to the SAM loader (most likely because the filename stem contains the substring 'sam' inside 'upsampled'), and saving only the state_dict doesn't produce the full Ultralytics checkpoint format the CLI expects, so it can't load your custom file as-is.

In your particular case, if directly exporting from the modified .pt file isn't playing nice, sticking to direct ONNX export via PyTorch as you've done is probably the more straightforward approach.
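
If you do want the upsampling baked into the exported graph from the PyTorch side, a rough, untested sketch would be to wrap the model so the Upsample actually runs inside forward() before tracing. Note that the handling of the head's output structure, the output names, and the output filename below are assumptions you should verify against your model:

import torch
import torch.nn as nn
from ultralytics import YOLO

class SegUpsampleWrapper(nn.Module):
    # Runs the segmentation model, then upsamples the prototype masks inside
    # forward(), so the Resize op is traced into the exported ONNX graph.
    def __init__(self, seg_model, scale=4):
        super().__init__()
        self.seg_model = seg_model
        self.upsample = nn.Upsample(scale_factor=scale, mode='bilinear', align_corners=False)

    def forward(self, x):
        preds, protos = self.seg_model(x)
        if isinstance(protos, (list, tuple)):
            # assumption: in eval mode the head nests its outputs and the proto
            # tensor is the last element -- print the outputs to confirm for your version
            protos = protos[-1]
        return preds, self.upsample(protos)

seg_model = YOLO('yolov8s-seg.pt').model.eval()
wrapped = SegUpsampleWrapper(seg_model).eval()
im = torch.zeros(1, 3, 1280, 1280)
torch.onnx.export(
    wrapped,
    im,
    'yolov8s-seg-upsampled-wrapped.onnx',  # illustrative output filename
    opset_version=17,
    input_names=['images'],
    output_names=['output0', 'upsampled_protos'],
)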

You're doing amazing work pushing through these challenges! Keep tinkering, and don't hesitate to share more details or reach out if you hit another snag. 🚀

pranavteli commented 3 months ago

Finally, I could make this work as expected. Sharing the code block in case anyone comes across a similar issue.

import onnx
from onnx import helper, TensorProto

# Load the original ONNX model
model_path = '/modelPath'
model = onnx.load(model_path)

# Define the upsampling layer
upsample_layer = helper.make_node(
    'Resize',
    inputs=['output1', '', 'scales'],  # Empty string for 'roi'
    outputs=['upsampled_segmaps'],
    mode='linear',  # interpolation
    name='Resize'
)

# scales input for Resize, assuming the desired output size is 4 times the input
scales = helper.make_tensor('scales', TensorProto.FLOAT, [4], [1.0, 1.0, 4.0, 4.0])

# Add nodes and tensors to the graph
nodes = list(model.graph.node) + [upsample_layer]
initializers = list(model.graph.initializer) + [scales]

# Define the outputs, keeping output0 and adding the new upsampled_output1
outputs = [
    model.graph.output[0],  # Keep the original output0
    helper.make_tensor_value_info('upsampled_segmaps', TensorProto.FLOAT, [1, 32, 640, 640]) # Adjust based on your model
]

# Create the modified graph
graph = helper.make_graph(
    nodes=nodes,
    name=model.graph.name,
    inputs=model.graph.input,
    outputs=outputs,
    initializer=initializers,
    value_info=model.graph.value_info
)

# Create the modified model
modified_model = helper.make_model(graph)

modified_model.opset_import[0].version = 17  # Ensure the opset version is correct
modified_model.ir_version = 8

# Save the modified model
onnx.save(modified_model, '/outputPath')
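
To sanity-check the modified graph, a quick sketch (this assumes the input tensor is named 'images' as in the standard Ultralytics ONNX export, and reuses the placeholder path from above):

import onnx
import onnxruntime as ort
import numpy as np

modified = onnx.load('/outputPath')   # same placeholder path as above
onnx.checker.check_model(modified)    # structural validation of the edited graph

session = ort.InferenceSession('/outputPath', providers=['CPUExecutionProvider'])
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)  # match your export imgsz
outputs = session.run(None, {'images': dummy})
for meta, out in zip(session.get_outputs(), outputs):
    print(meta.name, out.shape)  # expect upsampled_segmaps at 4x the proto resolution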

glenn-jocher commented 3 months ago

Hey, that's fantastic news! 🎉 Thanks for sharing your solution for adding an upsampling layer to the ONNX model. This will definitely be helpful for others facing similar challenges. Your approach to modifying the graph directly is neat and clear. Great job and happy coding! 🚀