microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI

[LayoutLM3] How to export models to onnx format? #1274

Open iweirman opened 1 year ago

iweirman commented 1 year ago

Describe the model I am using (LayoutLMv3): I've tried the solutions provided by Detectron2 and Hugging Face, but I haven't been successful in exporting the model for the "Document Layout Analysis on PubLayNet" task to the ONNX format. I'm hoping to receive assistance from the community on this matter.

KananVyas commented 1 year ago

Same issue. I want to convert the LayoutLMv3 object detection model to ONNX but haven't found a working approach.

The final error I'm getting is:


     41     onnx.save(
     42         onnx_model,
     43         os.path.join("/data/kanan/optimization/triton/v10_triton", "model.onnx"),
     44     )
     47 sample_inputs = get_sample_inputs()
---> 48 exported_model = export_caffe2_tracing(cfg, predictor.model, sample_inputs)

Cell In[10], line 40, in export_caffe2_tracing(cfg, torch_model, inputs)
     36 tracer = Caffe2Tracer(cfg, torch_model, inputs)
     38 import onnx
---> 40 onnx_model = tracer.export_onnx()
     41 onnx.save(
     42     onnx_model,
     43     os.path.join("/data/kanan/optimization/triton/v10_triton", "model.onnx"),
     44 )

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/detectron2/export/api.py:121, in Caffe2Tracer.export_onnx(self)
    109 """
    110 Export the model to ONNX format.
    111 Note that the exported model contains custom ops only available in caffe2, therefore it
   (...)
    117     onnx.ModelProto: an onnx model.
    118 """
    119 from .caffe2_export import export_onnx_model as export_onnx_model_impl
--> 121 return export_onnx_model_impl(self.traceable_model, (self.traceable_inputs,))

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/detectron2/export/caffe2_export.py:56, in export_onnx_model(model, inputs)
     54 with torch.no_grad():
     55     with io.BytesIO() as f:
---> 56         torch.onnx.export(
     57             model,
     58             inputs,
     59             f,
     60             operator_export_type=OperatorExportTypes.ONNX_ATEN_FALLBACK,
     61             # verbose=True,  # NOTE: uncomment this for debugging
     62             # export_params=True,
     63         )
     64         onnx_model = onnx.load_from_string(f.getvalue())
     66 # Apply ONNX's Optimization

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/onnx/__init__.py:316, in export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, opset_version, _retain_param_name, do_constant_folding, example_outputs, strip_doc_string, dynamic_axes, keep_initializers_as_inputs, custom_opsets, enable_onnx_checker, use_external_data_format)
     38 r"""
     39 Exports a model into ONNX format. If ``model`` is not a
     40 :class:`torch.jit.ScriptModule` nor a :class:`torch.jit.ScriptFunction`, this runs
   (...)
    312     model to the file ``f`` even if this is raised.
    313 """
    315 from torch.onnx import utils
--> 316 return utils.export(model, args, f, export_params, verbose, training,
    317                     input_names, output_names, operator_export_type, opset_version,
    318                     _retain_param_name, do_constant_folding, example_outputs,
    319                     strip_doc_string, dynamic_axes, keep_initializers_as_inputs,
    320                     custom_opsets, enable_onnx_checker, use_external_data_format)

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/onnx/utils.py:107, in export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, opset_version, _retain_param_name, do_constant_folding, example_outputs, strip_doc_string, dynamic_axes, keep_initializers_as_inputs, custom_opsets, enable_onnx_checker, use_external_data_format)
    102 if use_external_data_format is not None:
    103     warnings.warn("`use_external_data_format' is deprecated and ignored. Will be removed in next "
    104                   "PyTorch release. The code will work as it is False if models are not larger than 2GB, "
    105                   "Otherwise set to False because of size limits imposed by Protocol Buffers.")
--> 107 _export(model, args, f, export_params, verbose, training, input_names, output_names,
    108         operator_export_type=operator_export_type, opset_version=opset_version,
    109         do_constant_folding=do_constant_folding, example_outputs=example_outputs,
    110         dynamic_axes=dynamic_axes, keep_initializers_as_inputs=keep_initializers_as_inputs,
    111         custom_opsets=custom_opsets, use_external_data_format=use_external_data_format)

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/onnx/utils.py:724, in _export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, export_type, example_outputs, opset_version, do_constant_folding, dynamic_axes, keep_initializers_as_inputs, fixed_batch_size, custom_opsets, add_node_names, use_external_data_format, onnx_shape_inference)
    720     dynamic_axes = {}
    721 _validate_dynamic_axes(dynamic_axes, model, input_names, output_names)
    723 graph, params_dict, torch_out = \
--> 724     _model_to_graph(model, args, verbose, input_names,
    725                     output_names, operator_export_type,
    726                     example_outputs, val_do_constant_folding,
    727                     fixed_batch_size=fixed_batch_size,
    728                     training=training,
    729                     dynamic_axes=dynamic_axes)
    731 # TODO: Don't allocate a in-memory string for the protobuf
    732 defer_weight_export = export_type is not ExportTypes.PROTOBUF_FILE

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/onnx/utils.py:493, in _model_to_graph(model, args, verbose, input_names, output_names, operator_export_type, example_outputs, do_constant_folding, _disable_torch_constant_prop, fixed_batch_size, training, dynamic_axes)
    490 if isinstance(args, (torch.Tensor, int, float, bool)):
    491     args = (args, )
--> 493 graph, params, torch_out, module = _create_jit_graph(model, args)
    495 params_dict = _get_named_param_dict(graph, params)
    497 graph = _optimize_graph(graph, operator_export_type,
    498                         _disable_torch_constant_prop=_disable_torch_constant_prop,
    499                         fixed_batch_size=fixed_batch_size, params_dict=params_dict,
    500                         dynamic_axes=dynamic_axes, input_names=input_names,
    501                         module=module)

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/onnx/utils.py:437, in _create_jit_graph(model, args)
    435     return graph, params, torch_out, None
    436 else:
--> 437     graph, torch_out = _trace_and_get_graph_from_model(model, args)
    438     state_dict = _unique_state_dict(model)
    439     params = list(state_dict.values())

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/onnx/utils.py:388, in _trace_and_get_graph_from_model(model, args)
    381 def _trace_and_get_graph_from_model(model, args):
    382 
    383     # A basic sanity check: make sure the state_dict keys are the same
    384     # before and after running the model.  Fail fast!
    385     orig_state_dict_keys = _unique_state_dict(model).keys()
    387     trace_graph, torch_out, inputs_states = \
--> 388         torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
    389     warn_on_static_input_change(inputs_states)
    391     if orig_state_dict_keys != _unique_state_dict(model).keys():

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/jit/_trace.py:1166, in _get_trace_graph(f, args, kwargs, strict, _force_outplace, return_inputs, _return_inputs_states)
   1164 if not isinstance(args, tuple):
   1165     args = (args,)
-> 1166 outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
   1167 return outs

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
   1098 # If we don't have any hooks, we want to skip the rest of the logic in
   1099 # this function, and just call forward.
   1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102     return forward_call(*input, **kwargs)
   1103 # Do not call functions when jit is used
   1104 full_backward_hooks, non_full_backward_hooks = [], []

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/jit/_trace.py:127, in ONNXTracedModule.forward(self, *args)
    124     else:
    125         return tuple(out_vars)
--> 127 graph, out = torch._C._create_graph_by_tracing(
    128     wrapper,
    129     in_vars + module_state,
    130     _create_interpreter_name_lookup_fn(),
    131     self.strict,
    132     self._force_outplace,
    133 )
    135 if self._return_inputs:
    136     return graph, outs[0], ret_inputs[0]

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/jit/_trace.py:118, in ONNXTracedModule.forward.<locals>.wrapper(*args)
    116 if self._return_inputs_states:
    117     inputs_states.append(_unflatten(in_args, in_desc))
--> 118 outs.append(self.inner(*trace_inputs))
    119 if self._return_inputs_states:
    120     inputs_states[0] = (inputs_states[0], trace_inputs)

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
   1098 # If we don't have any hooks, we want to skip the rest of the logic in
   1099 # this function, and just call forward.
   1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102     return forward_call(*input, **kwargs)
   1103 # Do not call functions when jit is used
   1104 full_backward_hooks, non_full_backward_hooks = [], []

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/nn/modules/module.py:1090, in Module._slow_forward(self, *input, **kwargs)
   1088         recording_scopes = False
   1089 try:
-> 1090     result = self.forward(*input, **kwargs)
   1091 finally:
   1092     if recording_scopes:

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/contextlib.py:75, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     72 @wraps(func)
     73 def inner(*args, **kwds):
     74     with self._recreate_cm():
---> 75         return func(*args, **kwds)

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/detectron2/export/caffe2_modeling.py:268, in Caffe2GeneralizedRCNN.forward(self, inputs)
    266     return self._wrapped_model.inference(inputs)
    267 images = self._caffe2_preprocess_image(inputs)
--> 268 features = self._wrapped_model.backbone(images.tensor)
    269 proposals, _ = self._wrapped_model.proposal_generator(images, features)
    270 with self.roi_heads_patcher.mock_roi_heads():

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
   1098 # If we don't have any hooks, we want to skip the rest of the logic in
   1099 # this function, and just call forward.
   1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102     return forward_call(*input, **kwargs)
   1103 # Do not call functions when jit is used
   1104 full_backward_hooks, non_full_backward_hooks = [], []

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/nn/modules/module.py:1090, in Module._slow_forward(self, *input, **kwargs)
   1088         recording_scopes = False
   1089 try:
-> 1090     result = self.forward(*input, **kwargs)
   1091 finally:
   1092     if recording_scopes:

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/detectron2/modeling/backbone/fpn.py:138, in FPN.forward(self, x)
    125 def forward(self, x):
    126     """
    127     Args:
    128         input (dict[str->Tensor]): mapping feature map name (e.g., "res5") to
   (...)
    136             ["p2", "p3", ..., "p6"].
    137     """
--> 138     bottom_up_features = self.bottom_up(x)
    139     results = []
    140     prev_features = self.lateral_convs[0](bottom_up_features[self.in_features[-1]])

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
   1098 # If we don't have any hooks, we want to skip the rest of the logic in
   1099 # this function, and just call forward.
   1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102     return forward_call(*input, **kwargs)
   1103 # Do not call functions when jit is used
   1104 full_backward_hooks, non_full_backward_hooks = [], []

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/nn/modules/module.py:1090, in Module._slow_forward(self, *input, **kwargs)
   1088         recording_scopes = False
   1089 try:
-> 1090     result = self.forward(*input, **kwargs)
   1091 finally:
   1092     if recording_scopes:

File /data/kanan/006_table_region_detection/vdu-region-detection-framework/src/unilm/layoutlmv3/examples/object_detection/ditod/backbone.py:137, in VIT_Backbone.forward(self, x)
    128 """
    129 Args:
    130     x: Tensor of shape (N,C,H,W). H, W must be a multiple of ``self.size_divisibility``.
   (...)
    133     dict[str->Tensor]: names and the corresponding features
    134 """
    135 if "layoutlmv3" in self.name:
    136     return self.backbone.forward(
--> 137         input_ids=x["input_ids"] if "input_ids" in x else None,
    138         bbox=x["bbox"] if "bbox" in x else None,
    139         images=x["images"] if "images" in x else None,
    140         attention_mask=x["attention_mask"] if "attention_mask" in x else None,
    141         # output_hidden_states=True,
    142     )
    143 assert x.dim() == 4, f"VIT takes an input of shape (N, C, H, W). Got {x.shape} instead!"
    144 return self.backbone.forward_features(x)

File /data/opt/miniconda/envs/venv_detectron2/lib/python3.8/site-packages/torch/_tensor.py:705, in Tensor.__contains__(self, element)
    701 if isinstance(element, (torch.Tensor, Number)):
    702     # type hint doesn't understand the __contains__ result array
    703     return (element == self).any().item()  # type: ignore[union-attr]
--> 705 raise RuntimeError(
    706     "Tensor.__contains__ only supports Tensor or scalar, but you passed in a %s." %
    707     type(element)
    708 )

RuntimeError: Tensor.__contains__ only supports Tensor or scalar, but you passed in a <class 'str'>.
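
For reference, the traceback bottoms out in VIT_Backbone.forward in ditod/backbone.py: detectron2's Caffe2 export path hands the backbone a plain image tensor (features = self._wrapped_model.backbone(images.tensor)), while the LayoutLMv3 backbone expects a dict of named inputs, so the "input_ids" in x check ends up calling Tensor.__contains__ with a string. A minimal, standalone sketch of that failure mode (illustrative only, not taken from the export code):

import torch

x = torch.randn(1, 3, 224, 224)  # what the Caffe2 tracer passes to the backbone

try:
    _ = "input_ids" in x  # the dict-style check VIT_Backbone.forward performs
except RuntimeError as err:
    print(err)  # Tensor.__contains__ only supports Tensor or scalar, ...

In other words, the Caffe2Tracer route assumes a backbone that consumes a single image tensor, so it fails for the ditod LayoutLMv3 backbone (which needs input_ids, bbox, images, and attention_mask) before any ONNX graph is produced.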
qrsssh commented 1 month ago

Same question; has it been resolved?

qrsssh commented 4 weeks ago

Same question; has it been resolved?

murilosimao commented 3 weeks ago

I successfully converted the model to ONNX and even applied quantization using the Optimum CLI tool. I used the following commands:

Conversion:

pip install --upgrade --upgrade-strategy eager optimum[onnxruntime]

optimum-cli export onnx -m microsoft/layoutlmv3-large --task token-classification layoutlmv3-large-onnx

Quantization:

optimum-cli onnxruntime quantize --onnx_model layoutlmv3-large-onnx --avx2 -o layoutlmv3-large-onnx-quantized

More info about quantization is available in the Optimum documentation.

I noticed that RAM usage dropped and the speed increased slightly, though it's still not as fast as running on a GPU. I'm currently working on compiling the model to run on AWS Inf1 (Inferentia) instances. If anyone has ideas or wants to collaborate (see the inf1 issue), I'd really appreciate the help!
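
For anyone picking this up, the exported directory can then be served through Optimum's ONNX Runtime classes. A minimal usage sketch, assuming the layoutlmv3-large-onnx directory produced by the export command above; the page image, words, and boxes below are placeholders, not part of the original comment:

from PIL import Image
from transformers import AutoProcessor
from optimum.onnxruntime import ORTModelForTokenClassification

# apply_ocr=False: words and boxes are supplied directly instead of via pytesseract
processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-large", apply_ocr=False)
model = ORTModelForTokenClassification.from_pretrained("layoutlmv3-large-onnx")

image = Image.open("page.png").convert("RGB")   # placeholder document page
words = ["Invoice", "Total", "42.00"]           # normally produced by an OCR step
boxes = [[80, 40, 220, 70], [80, 700, 180, 730], [200, 700, 280, 730]]  # 0-1000 scale

inputs = processor(image, words, boxes=boxes, return_tensors="pt")
outputs = model(**inputs)                       # inference runs on ONNX Runtime
print(outputs.logits.shape)                     # (batch, sequence_length, num_labels)

Note that microsoft/layoutlmv3-large is a base checkpoint, so for a real labeling task you would export a fine-tuned token-classification checkpoint the same way and load that directory instead.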

qrsssh commented 2 weeks ago

optimum-cli only supports "bin" and similar Hugging Face model formats, but the LayoutLMv3 object detection model is a ".pth" checkpoint. How did you convert the format?
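
For what it's worth, the two export routes in this thread target different models: optimum-cli export onnx reads a Hugging Face checkpoint directory (config.json plus pytorch_model.bin or model.safetensors), such as microsoft/layoutlmv3-large used above with --task token-classification, whereas the PubLayNet layout-analysis model is a Detectron2 (ditod) model saved as a .pth checkpoint, which optimum-cli cannot consume; that Detectron2 path is the one failing in the traceback earlier in the thread. A quick, hypothetical check to tell the two apart (the file name model_final.pth is an assumption):

import torch

# Detectron2 checkpoints are plain torch.save dicts (usually with a "model"
# state-dict entry), not the config.json + weights directory layout that
# optimum-cli export onnx expects.
ckpt = torch.load("model_final.pth", map_location="cpu")
print(type(ckpt))
print(list(ckpt.keys())[:5] if isinstance(ckpt, dict) else "raw state_dict")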