huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

Mask2Former Support Requested #1063

Open · Denny-kef opened this issue 1 year ago

Denny-kef commented 1 year ago

Feature request

It would be great to get Mask2Former support in Optimum! Are there any obstacles specific to this model, compared to the architectures already supported?

Motivation

Mask2Former is becoming extremely popular for image segmentation, so it would be very valuable to have it supported in Optimum.

Your contribution

I will be looking into optimum more and am happy to submit a PR if I have a useful contribution.

regisss commented 1 year ago

Hi @Denny-kef! I assume you're talking about adding support for ONNX export here. I could see this for loop, which depends on the batch size, getting in the way of dynamic shapes, but static batch sizes should work! Feel free to open a PR following our guide and we'll try to assist you if there is any issue :slightly_smiling_face:
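For reference, a rough sketch of what such a config could look like, modeled on the existing vision configs in optimum/exporters/onnx/model_configs.py (the opset, dynamic axes, and output names below are untested guesses, not a verified implementation):

# Untested sketch of a Mask2Former ONNX config, following the pattern of the
# existing vision configs in optimum/exporters/onnx/model_configs.py.
from typing import Dict

from optimum.exporters.onnx.config import VisionOnnxConfig
from optimum.utils import NormalizedVisionConfig


class Mask2FormerOnnxConfig(VisionOnnxConfig):
    NORMALIZED_CONFIG_CLASS = NormalizedVisionConfig
    DEFAULT_ONNX_OPSET = 12  # guess; may need to be raised for some ops
    ATOL_FOR_VALIDATION = 1e-4

    @property
    def inputs(self) -> Dict[str, Dict[int, str]]:
        # Symbolic batch/spatial axes so the graph is not tied to the dummy input's shape.
        return {"pixel_values": {0: "batch_size", 2: "height", 3: "width"}}

    @property
    def outputs(self) -> Dict[str, Dict[int, str]]:
        # Output names follow transformers' Mask2Former segmentation outputs.
        return {
            "class_queries_logits": {0: "batch_size"},
            "masks_queries_logits": {0: "batch_size"},
        }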

vjsrinivas commented 6 months ago

@regisss @Denny-kef Is anyone working on this request? According to the Optimum supported-architectures list, it's still not supported for ONNX export.

I can attempt a PR if no one else is working on it.

fxmarty commented 6 months ago

Hi @vjsrinivas, feel free to open a PR, happy to help there.

vjsrinivas commented 6 months ago

@fxmarty I took a stab at MaskFormer first, since the two models share a very similar structure, but I'm running into issues with torch tracing through the network. I also get an error when exporting through Optimum's export path, while exporting directly with torch.onnx.export produces a 400 MB ONNX file that isn't usable. I tried both PyTorch 1.13 and 2.2.2 on CUDA 11.8, with the same results.

Any advice? I suspect some changes to the model code in transformers are needed for this export to work properly.


model_configs.py

class MaskFormerConfig(YolosOnnxConfig): # Basing inheritance off of SegFormer
    pass

tasks.py

"maskformer": supported_tasks_mapping(
            "feature-extraction",
            "image-segmentation",
            "semantic-segmentation",
            onnx="MaskFormerConfig"
        )
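Once those entries are registered, the export can be triggered end-to-end like this (a sketch; main_export mirrors what optimum-cli export onnx does, and the output directory name is arbitrary):

# Untested sketch: drive the export through Optimum's high-level API once the
# MaskFormer entries above are in model_configs.py and tasks.py.
from optimum.exporters.onnx import main_export

main_export(
    "facebook/maskformer-swin-base-coco",  # model id on the Hub
    output="maskformer_onnx",              # target directory for model.onnx
    task="image-segmentation",
)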

Trace warnings

Every warning is the same TracerWarning ("Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!"), raised at these locations:

transformers/models/maskformer/modeling_maskformer_swin.py:212: if num_channels != self.num_channels:
transformers/models/maskformer/modeling_maskformer_swin.py:202: if width % self.patch_size[1] != 0:
transformers/models/maskformer/modeling_maskformer_swin.py:205: if height % self.patch_size[0] != 0:
transformers/models/maskformer/modeling_maskformer_swin.py:574: was_padded = pad_values[3] > 0 or pad_values[5] > 0
transformers/models/maskformer/modeling_maskformer_swin.py:575: if was_padded:
transformers/models/maskformer/modeling_maskformer_swin.py:248: should_pad = (height % 2 == 1) or (width % 2 == 1)
transformers/models/maskformer/modeling_maskformer_swin.py:249: if should_pad:
transformers/models/maskformer/modeling_maskformer.py:543: if attn_weights.size() != (batch_size * self.num_heads, target_len, source_len):
transformers/models/maskformer/modeling_maskformer.py:574: if attn_output.size() != (batch_size * self.num_heads, target_len, self.head_dim):
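All of these warnings come from the same pattern: a Python if on a value derived from a traced tensor, which torch.jit.trace evaluates once on the dummy input and freezes into the graph. A toy illustration of the pattern (not the actual transformers code):

# Toy illustration of the data-dependent control flow behind the warnings above;
# this is not the transformers code, just the same pattern in miniature.
import torch
import torch.nn.functional as F

def pad_to_patch_size(pixel_values: torch.Tensor, patch_size: int = 4) -> torch.Tensor:
    _, _, height, width = pixel_values.shape
    # Under torch.jit.trace these comparisons emit TracerWarnings, and the branch
    # taken for the example input is baked into the graph, so inputs that need a
    # different amount of padding would be traced incorrectly.
    if height % patch_size != 0 or width % patch_size != 0:
        pad_h = (patch_size - height % patch_size) % patch_size
        pad_w = (patch_size - width % patch_size) % patch_size
        pixel_values = F.pad(pixel_values, (0, pad_w, 0, pad_h))
    return pixel_values

traced = torch.jit.trace(pad_to_patch_size, torch.randn(1, 3, 222, 222))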

Optimum export error

Traceback (most recent call last):
  File "/home/vijay/Documents/devmk6/optimum/export_maskformer.py", line 31, in <module>
    onnx_inputs, onnx_outputs = export(base_model, onnx_config, onnx_path, onnx_config.DEFAULT_ONNX_OPSET)
  File "/home/vijay/Documents/devmk6/optimum/optimum/exporters/onnx/convert.py", line 880, in export
    export_output = export_pytorch(
  File "/home/vijay/Documents/devmk6/optimum/optimum/exporters/onnx/convert.py", line 582, in export_pytorch
    onnx_export(
  File "/home/vijay/anaconda3/envs/hugging/lib/python3.10/site-packages/torch/onnx/utils.py", line 504, in export
    _export(
  File "/home/vijay/anaconda3/envs/hugging/lib/python3.10/site-packages/torch/onnx/utils.py", line 1529, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/home/vijay/anaconda3/envs/hugging/lib/python3.10/site-packages/torch/onnx/utils.py", line 1161, in _model_to_graph
    _set_input_and_output_names(graph, input_names, output_names)
  File "/home/vijay/anaconda3/envs/hugging/lib/python3.10/site-packages/torch/onnx/utils.py", line 1706, in _set_input_and_output_names
    set_names(list(graph.outputs()), output_names, "output")
  File "/home/vijay/anaconda3/envs/hugging/lib/python3.10/site-packages/torch/onnx/utils.py", line 1683, in set_names
    raise RuntimeError(
RuntimeError: number of output names provided (1) exceeded number of outputs (0)
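The driver hitting this is essentially the following (a sketch of export_maskformer.py reconstructed from the traceback, assuming the MaskFormerConfig registered above; the task is a guess from the mapping):

# Sketch of the driver script behind the traceback above (reconstructed; the
# real export_maskformer.py is not pasted here in full).
from pathlib import Path

from transformers import AutoModel

from optimum.exporters.onnx import export
from optimum.exporters.onnx.model_configs import MaskFormerConfig  # the class added above

base_model = AutoModel.from_pretrained("facebook/maskformer-swin-base-coco").eval()
onnx_config = MaskFormerConfig(base_model.config, task="semantic-segmentation")  # task guessed
onnx_path = Path("maskformer.onnx")

# torch fails while naming the graph outputs: the traced graph records zero
# outputs, so the single output name from onnx_config.outputs cannot be assigned.
onnx_inputs, onnx_outputs = export(
    base_model, onnx_config, onnx_path, onnx_config.DEFAULT_ONNX_OPSET
)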

400 MB ONNX file reproduction

from transformers import AutoModel
from torch.onnx import export as onnx_export
import torch
import torch.nn as nn
import onnx

#base_model = AutoModel.from_pretrained("nvidia/mit-b0") # works
base_model = AutoModel.from_pretrained("facebook/maskformer-swin-base-coco") # produces huge unusable onnx file
#base_model = AutoModel.from_pretrained("facebook/mask2former-swin-small-coco-instance") # fails 
base_model = base_model.eval()

input_tensor = torch.randn((1,3,224,224)).float()
output_dict = base_model(input_tensor)

onnx_export(
    base_model,
    input_tensor,
    "test.onnx",
    export_params=True,
    do_constant_folding=True,
    input_names=["pixel_values"],
    output_names=["encoder_last_hidden_state"],
    opset_version=16,
)

onnx_model = onnx.load("test.onnx")
onnx.checker.check_model(onnx_model)
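
As a follow-up sanity check on the exported file, something like this with onnxruntime shows whether the 400 MB graph actually loads and what outputs it exposes (output names may differ from what was requested):

# Untested sketch: load the exported file with onnxruntime and run a dummy input.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("test.onnx", providers=["CPUExecutionProvider"])
print([i.name for i in session.get_inputs()])   # expect ["pixel_values"]
print([o.name for o in session.get_outputs()])

dummy = np.random.randn(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {"pixel_values": dummy})
print([o.shape for o in outputs])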