huggingface / optimum

πŸš€ Accelerate training and inference of πŸ€— Transformers and πŸ€— Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0
2.45k stars · 432 forks

Support BLIP ONNX export & runtime #1229

Open · Arrkwen opened 1 year ago

Arrkwen commented 1 year ago

Feature request

I want to export my model to ONNX, but an error happens saying the architecture is not supported. For example, if I try to export the BLIP model "Salesforce/blip-image-captioning-large" from the Hugging Face Hub, it cannot be exported yet.

Motivation

Export BLIP models to ONNX.

Your contribution

I am willing to submit a PR.

Arrkwen commented 1 year ago

I found it from here: https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#customize-the-export-of-official-transformers-models
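As I understand that guide, the `inputs`/`outputs` mappings you define on a custom OnnxConfig are what optimum ultimately flattens into the `dynamic_axes` argument of `torch.onnx.export` (a dict of `{tensor name: {axis index: symbolic dimension name}}`). A minimal pure-Python sketch of that flattening, where `to_dynamic_axes` is a hypothetical helper for illustration, not an optimum API:

```python
# Hypothetical helper (not an optimum API) showing how an OnnxConfig's
# `inputs`/`outputs` mappings correspond to torch.onnx.export's
# `dynamic_axes` argument: {tensor_name: {axis_index: symbolic_dim_name}}.
def to_dynamic_axes(*mappings):
    axes = {}
    for mapping in mappings:
        for name, dims in mapping.items():
            axes[name] = dict(dims)
    return axes

# The same shapes as in the BLIP config discussed below (illustrative).
inputs = {
    "pixel_values": {0: "image_batch_size", 1: "num_channels", 2: "height", 3: "width"},
    "input_ids": {0: "text_batch_size", 1: "sequence_length"},
    "attention_mask": {0: "text_batch_size", 1: "sequence_length"},
}
outputs = {"logits": {0: "text_batch_size", 1: "sequence_length"}}

print(to_dynamic_axes(inputs, outputs))
```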

fxmarty commented 1 year ago

Thank you, BLIP is in transformers so we could support the ONNX export though. Feel free to submit a PR if you define a config that works!

Arrkwen commented 1 year ago

@fxmarty, config as below:

from pathlib import Path
from typing import Dict

from transformers.models.blip import BlipForConditionalGeneration

from optimum.exporters.onnx import export
from optimum.exporters.onnx.model_configs import (
    NormalizedTextAndVisionConfig,
    TextAndVisionOnnxConfig,
)


class BLIPNormalizedConfig(NormalizedTextAndVisionConfig):
    TEXT_CONFIG = "text_config"
    VISION_CONFIG = "vision_config"


# ONNX export config
class BLIPOnnxConfig(TextAndVisionOnnxConfig):
    NORMALIZED_CONFIG_CLASS = BLIPNormalizedConfig

    @property
    def inputs(self) -> Dict[str, Dict[int, str]]:
        return {
            "pixel_values": {0: "image_batch_size", 1: "num_channels", 2: "height", 3: "width"},
            "input_ids": {0: "text_batch_size", 1: "sequence_length"},
            "attention_mask": {0: "text_batch_size", 1: "sequence_length"},
        }


# Export to ONNX
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")
# print("config:", model.config)
export(
    model=model,
    config=BLIPOnnxConfig(config=model.config, task="image-to-text"),
    output=Path("onnx_model/blip.onnx"),
    device="cuda",
    input_shapes={"width": 384, "height": 384},
)

An error happened: graph.outputs() is empty, and the stack trace is below. Can you help me?

/blip/transformers2onnx.py:32 in <module>
    export(
        model=model,
        config=BLIPOnnxConfig(config=model.config, task="image-to-text"),
        output=Path("onnx_model/blip.onnx"),
        ...

/lib/python3.8/site-packages/optimum/exporters/onnx/convert.py:855 in export
    export_output = export_pytorch(model, config, opset, ...

/lib/python3.8/site-packages/optimum/exporters/onnx/convert.py:575 in export_pytorch
    onnx_export(model, (dummy_inputs,), f=output.as_posix(), ...

/lib/python3.8/site-packages/torch/onnx/utils.py:504 in export
    _export(model, args, f, ...

/lib/python3.8/site-packages/torch/onnx/utils.py:1529 in _export
    graph, params_dict, torch_out = _model_to_graph(model, args, verbose, ...

/lib/python3.8/site-packages/torch/onnx/utils.py:1161 in _model_to_graph
    _set_input_and_output_names(graph, input_names, output_names)

/lib/python3.8/site-packages/torch/onnx/utils.py:1705 in _set_input_and_output_names
    set_names(list(graph.inputs()), input_names, "input")
    set_names(list(graph.outputs()), output_names, "output")

/lib/python3.8/site-packages/torch/onnx/utils.py:1683 in set_names
    if len(name_list) > len(node_list):
        raise RuntimeError(
            "number of %s names provided (%d) exceeded number of ...
            % (descriptor, len(name_list), descriptor, len(node_l...
        )
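To make the failure mode concrete: the last frame raises because more output names were supplied than the traced graph actually has outputs, and here graph.outputs() is empty. A minimal self-contained reproduction of that check (the values and message format are illustrative, mirroring the truncated lines above; the exact wording in torch may differ):

```python
# Minimal reproduction of the check that raises in the last traceback frame
# (torch/onnx/utils.py, set_names): if the OnnxConfig declares more output
# names than the traced graph has outputs, the export fails.
def set_names(node_list, name_list, descriptor):
    if name_list is None:
        return
    if len(name_list) > len(node_list):
        raise RuntimeError(
            "number of %s names provided (%d) exceeded number of %ss (%d)"
            % (descriptor, len(name_list), descriptor, len(node_list))
        )

graph_outputs = []          # what graph.outputs() returned in this export
output_names = ["logits"]   # illustrative: one declared output name
try:
    set_names(graph_outputs, output_names, "output")
except RuntimeError as e:
    print(e)
```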
xenova commented 8 months ago

@fxmarty I'd be interested in adding this to transformers.js - do you think it's worth revisiting?

IlyasMoutawwakil commented 5 months ago

@Arrkwen the config you provided works for me. Happy to review it in a PR.