Typiqally closed this issue 2 years ago
Looking at the code, it's not clear to me what exactly the issue is.
@Typiqally - can you give us a minimal example that reproduces this issue?
I appreciate your prompt response. I assumed this was a known limitation that simply lacked support, which is why I did not include a snippet initially. Here are the steps to reproduce:
I'm using MMDeploy, a utility to export models based on the OpenMMLab framework. This utility has an option to convert models from MMLab to CoreML. This option takes the following steps:
During this conversion, the aforementioned exception is thrown, which might be due to an incompatibility with the model. The MMDeploy tool internally calls the following function:
def from_torchscript(torchscript_model: Union[str,
                                              torch.jit.RecursiveScriptModule],
                     output_file_prefix: str,
                     input_names: Sequence[str],
                     output_names: Sequence[str],
                     input_shapes: Dict,
                     convert_to: str = 'neuralnetwork',
                     fp16_mode: bool = False,
                     skip_model_load: bool = True,
                     **kwargs):
    """Create a coreml engine from torchscript.

    Args:
        torchscript_model (Union[str, torch.jit.RecursiveScriptModule]):
            The torchscript model to be converted.
        output_file_prefix (str): The output file prefix.
        input_names (Sequence[str]): The input names of the model.
        output_names (Sequence[str]): The output names of the model.
        input_shapes (Dict): The input shapes include max_shape, min_shape and
            default_shape
        convert_to (str, optional): The converted model type, can be
            'neuralnetwork' or 'mlprogram'. Defaults to 'neuralnetwork'.
        fp16_mode (bool, optional): Convert to fp16 model. Defaults to False.
        skip_model_load (bool, optional): Skip model load. Defaults to True.
    """
    try:
        from mmdeploy.backend.torchscript import get_ops_path
        torch.ops.load_library(get_ops_path())
    except Exception as e:
        get_root_logger().warning(
            'Can not load custom ops because:\n'
            f'{e}\n'
            'Some model might not be able to be converted.')

    if isinstance(torchscript_model, str):
        torchscript_model = torch.jit.load(torchscript_model)

    inputs = []
    outputs = []

    for name in input_names:
        shape = create_shape(name, input_shapes[name])
        inputs.append(shape)

    for name in output_names:
        outputs.append(ct.TensorType(name=name))

    if convert_to == 'neuralnetwork':
        compute_precision = None
    else:
        if fp16_mode:
            compute_precision = ct.precision.FLOAT16
        else:
            compute_precision = ct.precision.FLOAT32

    mlmodel = ct.convert(
        model=torchscript_model,
        inputs=inputs,  # In my case, [ImageType[name=input, shape=[1, 3, 608, 608], scale=0.00392156862745098, bias=[0, 0, 0], color_layout=ColorLayout.RGB, channel_first=None]]
        outputs=outputs,  # In my case, [TensorType[name=dets, shape=None, dtype=None], TensorType[name=labels, shape=None, dtype=None]]
        compute_precision=compute_precision,  # In my case, ComputePrecision.FLOAT32
        convert_to=convert_to,  # In my case, mlprogram
        skip_model_load=False)

    suffix = get_model_suffix(convert_to)
    output_path = output_file_prefix + suffix
    mlmodel.save(output_path)
In my case, I'm trying to convert the faster_rcnn_regnetx-3.2GF_fpn_1x_coco model to CoreML by using the following command:
python mmdeploy/tools/deploy.py \
    mmdeploy/configs/mmdet/detection/detection_coreml_static-800x1344.py \
    mmdetection/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_1x_coco.py \
    checkpoints/faster_rcnn_regnetx-3.2GF_fpn_1x_coco_20200517_175927-126fd9bf.pth \
    mmdetection/demo/demo.jpg \
    --work-dir work_dir/faster_rcnn_regnetx \
    --device cpu
I understand that this issue is mostly related to the MMDeploy utility and not CoreML; however, I believe the conversion failure itself might originate in this repository. I debugged further and found that the operation where the issue occurs is called topk. When I remove the assertion for the multiple index axes case, I get the following stack trace, which introduces new information:
Traceback (most recent call last):
  File "mmdeploy/tools/deploy.py", line 461, in <module>
    main()
  File "mmdeploy/tools/deploy.py", line 407, in main
    from_torchscript(torchscript_path, output_file_prefix,
  File "/Users/typically/Workspace/vbti-plant-morphology/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "/Users/typically/Workspace/vbti-plant-morphology/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "/Users/typically/Workspace/vbti-plant-morphology/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "/Users/typically/Workspace/vbti-plant-morphology/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/Users/typically/Workspace/vbti-plant-morphology/mmdeploy/mmdeploy/backend/coreml/torchscript2coreml.py", line 106, in from_torchscript
    mlmodel = ct.convert(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 451, in convert
    mlmodel = mil_convert(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 193, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 220, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 283, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 115, in __call__
    return load(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 53, in load
    return _perform_torch_convert(converter, debug)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 92, in _perform_torch_convert
    prog = converter.convert()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 269, in convert
    convert_nodes(self.context, self.graph)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 92, in convert_nodes
    add_op(context, node)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 3203, in index
    indices = mb.stack(values=valid_indices, axis=indices_rank)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/mil/ops/registry.py", line 172, in add_op
    return cls._add_op(op_cls_to_add, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/mil/builder.py", line 191, in _add_op
    new_op.type_value_inference()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/mil/operation.py", line 241, in type_value_inference
    output_types = self.type_inference()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mmlabs/lib/python3.8/site-packages/coremltools/converters/mil/mil/ops/defs/iOS15/tensor_operation.py", line 1236, in type_inference
    raise ValueError(msg.format(t.name, t.shape, t_shape))
ValueError: Component tensor topk_inds0.1 has shape (1, 1000), others have (1, 1)
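One possible reading of the final ValueError, inferred from the message rather than confirmed anywhere in this thread: PyTorch advanced indexing broadcasts index tensors against each other, so indices of shape (1, 1) and (1, 1000) are valid together, while the stack call in the traceback requires every component to have the same shape. The pure-Python sketch below illustrates both rules. `check_stack_shapes` and `broadcast_shapes` are hypothetical helpers written for this illustration; they are not coremltools or PyTorch APIs.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def check_stack_shapes(shapes: Sequence[Shape]) -> Shape:
    """Mimic a strict stack rule: every component must share one shape."""
    first = shapes[0]
    for shape in shapes[1:]:
        if shape != first:
            raise ValueError(
                f"Component tensor has shape {shape}, others have {first}")
    return first


def broadcast_shapes(shapes: Sequence[Shape]) -> Shape:
    """Compute the NumPy/PyTorch-style broadcast of several shapes."""
    rank = max(len(s) for s in shapes)
    # Left-pad shorter shapes with 1s so all shapes have the same rank.
    padded = [(1,) * (rank - len(s)) + tuple(s) for s in shapes]
    result = []
    for dims in zip(*padded):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"Shapes {list(shapes)} cannot be broadcast")
        result.append(sizes.pop() if sizes else 1)
    return tuple(result)


# Shapes taken from the error message: a (1, 1) index next to
# topk_inds with shape (1, 1000).
index_shapes = [(1, 1), (1, 1000)]

# Broadcasting accepts the pair and yields one common shape.
common = broadcast_shapes(index_shapes)

# A strict stack rejects the same pair.
try:
    check_stack_shapes(index_shapes)
    stack_ok = True
except ValueError:
    stack_ok = False
```

If this reading is right, expanding each index tensor to the broadcast shape before stacking would satisfy the shape check. Whether that is the appropriate fix in the converter is not something the traceback alone can establish.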
As I said before, I believe this is not currently supported, and I would appreciate it if you could tell me when support is expected.
Looking further into it, the previously mentioned model uses multi-class non-maximum suppression (NMS), which might be causing the issue.
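For readers unfamiliar with the operation, here is a minimal pure-Python sketch of multi-class non-maximum suppression with a top-k pre-selection step. It is a generic illustration, not MMDeploy's or MMDetection's implementation; the names `iou` and `multiclass_nms`, the 0.5 threshold, and the `pre_top_k` default of 1000 (chosen to mirror the 1000 in the error message) are all assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def multiclass_nms(boxes: Sequence[Box],
                   scores: Sequence[float],
                   labels: Sequence[int],
                   iou_thr: float = 0.5,
                   pre_top_k: int = 1000) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Top-k pre-selection: keep only the pre_top_k highest-scoring boxes.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_top_k]

    keep: List[int] = []
    for i in order:
        # A box is suppressed only by an already kept box of the same
        # class whose overlap exceeds the threshold.
        suppressed = any(
            labels[i] == labels[j] and iou(boxes[i], boxes[j]) > iou_thr
            for j in keep)
        if not suppressed:
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 10, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.6]
labels = [0, 0, 1, 0]
kept = multiclass_nms(boxes, scores, labels)
```

In this toy input the second box overlaps the first and shares its class, so it is dropped, while the third box survives despite overlapping the first because its class differs.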
It's still not clear what specifically the issue is here. The next step toward getting this resolved is a simple, standalone example that reproduces the problem, i.e. something that could become a unit test and doesn't require downloading an external model.
I'm sorry, but I believe the issue was actually with MMDeploy after all. Their two-stage detector was broken after a merge, which caused the issue mentioned above; see https://github.com/open-mmlab/mmdeploy/issues/1038.
I'm still not completely certain as to why this happens, and sadly, due to their framework's integration, I'm unable to provide a sufficient code snippet without requiring external code or models.
When converting the Faster R-CNN RegNetX-3.2GF-FPN model from TorchScript to CoreML I get the following error:
Are there any plans to implement this in the near future? If so, in what kind of time frame is it expected?