microsoft / onnxruntime-inference-examples

Examples for using ONNX Runtime for machine learning inferencing.

Exporting the vit_b model with sam exporter? #422

Open flores-o opened 5 months ago

flores-o commented 5 months ago

https://github.com/microsoft/onnxruntime-inference-examples/blob/8fcc97e1e035d57ffdfd19b76732e3fc79d8c2a6/js/segment-anything/index.js#L21

Hi @guschmue, can you share the command you used with the sam exporter tool to get this ONNX file?

P.S.: I tried exporting the vit_b version with the sam exporter tool and got a larger ONNX file (360 MB vs. your 180 MB) that runs slower in the browser with WebGPU. Did you convert the model weights to mixed/half precision before exporting with sam_exporter?
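
For reference, the roughly 2x size difference looks like an FP32 -> FP16 gap. A post-export pass like the sketch below (using onnxconverter-common; the file names are placeholders I made up) would roughly halve the model, so I'm wondering if something along these lines was applied:

import onnx
from onnxconverter_common import float16

# Placeholder path for an exported SAM decoder
model_fp32 = onnx.load("sam_vit_b_decoder.onnx")

# Cast FP32 initializers/ops to FP16 while keeping graph I/O in FP32
model_fp16 = float16.convert_float_to_float16(
    model_fp32,
    keep_io_types=True,
)
onnx.save(model_fp16, "sam_vit_b_decoder_fp16.onnx")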

Thank you

SangbumChoi commented 4 months ago

@flores-o I think if you open the file in Netron you can see that there is an upcast layer in it.

import torch
import torchvision.models as models

# Load a pre-trained model (ResNet-18 here as a stand-in)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# Convert all parameters and buffers to half precision (FP16);
# this already casts the weights, so no separate quantization pass is needed
model = model.half()

# Export the FP16 model to ONNX
torch.onnx.export(
    model,
    torch.randn(1, 3, 224, 224).half(),  # dummy FP16 input
    "resnet_fp16.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=11,
)
[Two screenshots attached (2024-05-10)]
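
As a quick way to spot those Cast (upcast) nodes without opening Netron, something like the following should work, assuming the onnx Python package and a placeholder file name:

import onnx

model = onnx.load("sam_decoder.onnx")  # placeholder path

# Print every Cast node and the element type it casts to
for node in model.graph.node:
    if node.op_type == "Cast":
        to_attr = next(a for a in node.attribute if a.name == "to")
        print(node.name or "<unnamed>", "-> casts to",
              onnx.TensorProto.DataType.Name(to_attr.i))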