Closed by Typiqally 1 year ago
Thanks for the notification, and sorry for the trouble.
We did some tricks on torch.topk to fix a GPU export problem, and I guess that fix leads to this error. I have created a rewriter for the CoreML topk op. Please give it a try: https://github.com/grimoire/mmdeploy/tree/fix-coreml
Same environment as before, but with your patch I get the following stack trace:
2023-01-13 13:35:36,529 - mmdeploy - INFO - Save PyTorch model: /Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/end2end.pt.
2023-01-13 13:35:36,620 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.pytorch2torchscript.torch2torchscript
2023-01-13 13:35:36,759 - mmdeploy - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
Traceback (most recent call last):
File "libs/mmdeploy/tools/deploy.py", line 308, in <module>
main()
File "libs/mmdeploy/tools/deploy.py", line 232, in main
backend_files = to_backend(
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
return self.call_function(func_name_, *args, **kwargs)
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
return self.call_function_local(func_name, *args, **kwargs)
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
return pipe_caller(*args, **kwargs)
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
ret = func(*args, **kwargs)
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/apis/utils/utils.py", line 95, in to_backend
return backend_mgr.to_backend(
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/backend/coreml/backend_manager.py", line 83, in to_backend
from .torchscript2coreml import from_torchscript, get_model_suffix
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/backend/coreml/__init__.py", line 13, in <module>
from .torchscript2coreml import get_model_suffix
File "/Users/typically/Workspace/vbti/PlantMorphology/tools/deploy/libs/mmdeploy/mmdeploy/backend/coreml/torchscript2coreml.py", line 52, in <module>
input_names: list[str],
TypeError: 'type' object is not subscriptable
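The `TypeError: 'type' object is not subscriptable` above is a Python version issue: built-in generics such as `list[str]` only became subscriptable in Python 3.9 (PEP 585), so evaluating that annotation on Python 3.8 fails at import time. A minimal portable sketch (the function name mirrors the failing file but is purely illustrative):

```python
from __future__ import annotations  # annotations stay strings, never evaluated

from typing import List


def from_torchscript(input_names: List[str]) -> int:
    """Illustrative stand-in for the failing signature.

    Either typing.List[str] or the __future__ import above makes
    list[str]-style annotations safe on Python 3.8.
    """
    return len(input_names)


print(from_torchscript(["input"]))
```

Using `typing.List` keeps the module importable on every Python version mmdeploy supports, without changing runtime behavior.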
@Typiqally Updated, please try again.
Thank you @grimoire, it works now. I haven't tested the model completely, but the visualization from the deployment shows that it is working as expected.
Hi @grimoire, I would be happy to see your check_env.py
output, because the export to Core ML is still not working for me (using the same config and checkpoint); it fails with a coremltools error.
I'm running everything on Google Colab (link)
!pip install coremltools
!pip install opencv-python
!pip3 install openmim
!mim install mmcv-full
# clone mmdeploy to get the deployment config. `--recursive` is not necessary
!git clone https://github.com/open-mmlab/mmdeploy.git
%cd mmdeploy
!pip install -v -e .
%cd ..
# clone the mmdetection repo. We have to use its config file to build the PyTorch nn module
!git clone https://github.com/open-mmlab/mmdetection.git
%cd mmdetection
!pip install -v -e .
%cd ..
# download checkpoint
!wget -P checkpoints https://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r18_fpn_1x_coco/retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth
# run the command to start model conversion
!python mmdeploy/tools/deploy.py \
mmdeploy/configs/mmdet/detection/detection_coreml_static-800x1344.py \
mmdetection/configs/retinanet/retinanet_r18_fpn_1x_coco.py \
checkpoints/retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth \
mmdetection/demo/demo.jpg \
--work-dir mmdeploy_model/retina \
--device cpu \
--dump-info
Error
Traceback (most recent call last):
File "mmdeploy/tools/deploy.py", line 308, in <module>
main()
File "mmdeploy/tools/deploy.py", line 129, in main
export2SDK(
File "/content/mmdeploy/mmdeploy/backend/sdk/export_info.py", line 456, in export2SDK
deploy_info = get_deploy(deploy_cfg, model_cfg, work_dir, device)
File "/content/mmdeploy/mmdeploy/backend/sdk/export_info.py", line 376, in get_deploy
models = get_models(deploy_cfg, model_cfg, work_dir, device)
File "/content/mmdeploy/mmdeploy/backend/sdk/export_info.py", line 148, in get_models
from mmdeploy.backend.coreml import get_model_suffix
File "/content/mmdeploy/mmdeploy/backend/coreml/__init__.py", line 12, in <module>
from . import ops
File "/content/mmdeploy/mmdeploy/backend/coreml/ops.py", line 28, in <module>
def log2(context, node):
File "/usr/local/lib/python3.8/dist-packages/coremltools/converters/mil/frontend/torch/torch_op_registry.py", line 58, in register_torch_op
return func_wrapper(_func)
File "/usr/local/lib/python3.8/dist-packages/coremltools/converters/mil/frontend/torch/torch_op_registry.py", line 42, in func_wrapper
raise ValueError("Torch op {} already registered.".format(f_name))
ValueError: Torch op log2 already registered.
check_env
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
/usr/local/lib/python3.8/dist-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
2023-02-04 16:44:15,345 - mmdeploy - INFO -
2023-02-04 16:44:15,345 - mmdeploy - INFO - **********Environmental information**********
fatal: not a git repository (or any of the parent directories): .git
2023-02-04 16:44:15,711 - mmdeploy - INFO - sys.platform: linux
2023-02-04 16:44:15,712 - mmdeploy - INFO - Python: 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0]
2023-02-04 16:44:15,712 - mmdeploy - INFO - CUDA available: False
2023-02-04 16:44:15,712 - mmdeploy - INFO - GCC: x86_64-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
2023-02-04 16:44:15,712 - mmdeploy - INFO - PyTorch: 1.13.1+cu116
2023-02-04 16:44:15,712 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
- GCC 9.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
2023-02-04 16:44:15,712 - mmdeploy - INFO - TorchVision: 0.14.1+cu116
2023-02-04 16:44:15,712 - mmdeploy - INFO - OpenCV: 4.6.0
2023-02-04 16:44:15,712 - mmdeploy - INFO - MMCV: 1.7.1
2023-02-04 16:44:15,712 - mmdeploy - INFO - MMCV Compiler: GCC 9.3
2023-02-04 16:44:15,712 - mmdeploy - INFO - MMCV CUDA Compiler: 11.6
2023-02-04 16:44:15,712 - mmdeploy - INFO - MMDeploy: 0.12.0+
2023-02-04 16:44:15,712 - mmdeploy - INFO -
2023-02-04 16:44:15,712 - mmdeploy - INFO - **********Backend information**********
2023-02-04 16:44:15,726 - mmdeploy - INFO - tensorrt: None
2023-02-04 16:44:15,729 - mmdeploy - INFO - ONNXRuntime: None
2023-02-04 16:44:15,730 - mmdeploy - INFO - pplnn: None
2023-02-04 16:44:15,734 - mmdeploy - INFO - ncnn: None
2023-02-04 16:44:15,738 - mmdeploy - INFO - snpe: None
2023-02-04 16:44:15,740 - mmdeploy - INFO - openvino: None
2023-02-04 16:44:15,745 - mmdeploy - INFO - torchscript: 1.13.1+cu116
2023-02-04 16:44:15,745 - mmdeploy - INFO - torchscript custom ops: NotAvailable
2023-02-04 16:44:15,875 - mmdeploy - INFO - rknn-toolkit: None
2023-02-04 16:44:15,875 - mmdeploy - INFO - rknn2-toolkit: None
2023-02-04 16:44:15,878 - mmdeploy - INFO - ascend: None
2023-02-04 16:44:19,423 - mmdeploy - INFO - coreml: 6.2
2023-02-04 16:44:19,426 - mmdeploy - INFO - tvm: None
2023-02-04 16:44:19,426 - mmdeploy - INFO -
2023-02-04 16:44:19,426 - mmdeploy - INFO - **********Codebase information**********
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmdet: 2.28.1
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmseg: None
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmcls: None
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmocr: None
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmedit: None
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmdet3d: None
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmpose: None
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmrotate: None
2023-02-04 16:44:19,428 - mmdeploy - INFO - mmaction: None
@JohannesBauer97 Try commenting out the function below: https://github.com/open-mmlab/mmdeploy/blob/b85f34141b61ad0d70897cc6dcfef38928b673fb/mmdeploy/backend/coreml/ops.py#L28
// Update: I tried with pytorch==1.13.1 torchvision==0.14.1 mmcv-full==1.7.1 coremltools==6.2
and
pytorch==1.12.1 torchvision==0.13.1 mmcv-full==1.7.0 coremltools==6.1
and got the same error messages that I posted in the original comment below. @grimoire, could you try converting a model to Core ML once and, if it works, send the check_env output so we can compare environments?
// Original:
@grimoire Then I receive a new, similar error: Torch op coreml_nms already registered.
When commenting out the nms op, another issue is raised (see below).
I'll try downgrading coremltools; it might be an issue with the version released yesterday:
https://github.com/apple/coremltools/releases/tag/6.2
Traceback (most recent call last):
File "mmdeploy/tools/deploy.py", line 308, in <module>
main()
File "mmdeploy/tools/deploy.py", line 232, in main
backend_files = to_backend(
File "/Users/joba/Documents/Data Science/mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
return self.call_function(func_name_, *args, **kwargs)
File "/Users/joba/Documents/Data Science/mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
return self.call_function_local(func_name, *args, **kwargs)
File "/Users/joba/Documents/Data Science/mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
return pipe_caller(*args, **kwargs)
File "/Users/joba/Documents/Data Science/mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
ret = func(*args, **kwargs)
File "/Users/joba/Documents/Data Science/mmlab/mmdeploy/mmdeploy/apis/utils/utils.py", line 95, in to_backend
return backend_mgr.to_backend(
File "/Users/joba/Documents/Data Science/mmlab/mmdeploy/mmdeploy/backend/coreml/backend_manager.py", line 109, in to_backend
from_torchscript(
File "/Users/joba/Documents/Data Science/mmlab/mmdeploy/mmdeploy/backend/coreml/torchscript2coreml.py", line 95, in from_torchscript
mlmodel = ct.convert(
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 444, in convert
mlmodel = mil_convert(
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 187, in mil_convert
return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 211, in _mil_convert
proto, mil_program = mil_convert_to_proto(
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 281, in mil_convert_to_proto
prog = frontend_converter(model, **kwargs)
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 109, in __call__
return load(*args, **kwargs)
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 57, in load
return _perform_torch_convert(converter, debug)
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 104, in _perform_torch_convert
raise e
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 96, in _perform_torch_convert
prog = converter.convert()
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 281, in convert
convert_nodes(self.context, self.graph)
File "/Users/joba/miniforge3/envs/mmlab/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 84, in convert_nodes
raise RuntimeError(
RuntimeError: PyTorch convert function for op 'mmdeploy::coreml_nms' not implemented.
2023-02-06 10:41:16,426 - mmdeploy - INFO - **********Backend information**********
2023-02-06 10:41:16,438 - mmdeploy - INFO - tensorrt: None
2023-02-06 10:41:16,440 - mmdeploy - INFO - ONNXRuntime: None
2023-02-06 10:41:16,441 - mmdeploy - INFO - pplnn: None
2023-02-06 10:41:16,445 - mmdeploy - INFO - ncnn: None
2023-02-06 10:41:16,448 - mmdeploy - INFO - snpe: None
2023-02-06 10:41:16,449 - mmdeploy - INFO - openvino: None
2023-02-06 10:41:16,452 - mmdeploy - INFO - torchscript: 1.10.2
2023-02-06 10:41:16,452 - mmdeploy - INFO - torchscript custom ops: Available
2023-02-06 10:41:16,480 - mmdeploy - INFO - rknn-toolkit: None
2023-02-06 10:41:16,480 - mmdeploy - INFO - rknn2-toolkit: None
2023-02-06 10:41:16,483 - mmdeploy - INFO - ascend: None
2023-02-06 10:41:17,036 - mmdeploy - INFO - coreml: 6.0b1
2023-02-06 10:41:18,016 - mmdeploy - INFO - tvm: 0.10.dev714+gd4bf9ecf5=
The log2 converter was added to coreml by ... me. It should be ignored in the latest version. We will fix it.
coreml_nms is a PyTorch custom op that does nothing except map the NMS used in mmdetection to the Core ML one. I guess the latest coremltools registers an op with the same name. You can rename the op in
https://github.com/open-mmlab/mmdeploy/blob/b85f34141b61ad0d70897cc6dcfef38928b673fb/csrc/mmdeploy/backend_ops/torchscript/ops/coreml_nms/coreml_nms_cpu.cpp#L30
https://github.com/open-mmlab/mmdeploy/blob/b85f34141b61ad0d70897cc6dcfef38928b673fb/csrc/mmdeploy/backend_ops/torchscript/ops/bind.cpp#L11
https://github.com/open-mmlab/mmdeploy/blob/b85f34141b61ad0d70897cc6dcfef38928b673fb/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py#L310
and
https://github.com/open-mmlab/mmdeploy/blob/b85f34141b61ad0d70897cc6dcfef38928b673fb/mmdeploy/backend/coreml/ops.py#L9
See if the conversion works, or just downgrade coremltools.
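Rather than hard-coding either fix, a version gate is one way to decide whether mmdeploy still needs to register its own converters. The helper below is a hypothetical sketch that only parses version strings; the 6.2 boundary comes from this thread, where coremltools 6.2 started shipping converters that collide with mmdeploy's:

```python
def predates(ver_str, boundary=(6, 2)):
    """Return True if ver_str is older than boundary (major, minor).

    Pre-release strings such as "6.0b1" fail integer parsing and are
    conservatively treated as older than the boundary.
    """
    try:
        parsed = tuple(int(x) for x in ver_str.split(".")[:2])
    except ValueError:
        return True
    return parsed < boundary


# Hypothetical usage: only register the extra ops on older coremltools.
# import coremltools
# if predates(coremltools.__version__):
#     register_extra_converters()  # illustrative name, not a real function
print(predates("6.1"), predates("6.2"))
```

This keeps one codebase working across coremltools releases instead of requiring users to pin or patch by hand.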
I'll give it a try as soon as I find the time for it, probably within this week. I'll also probably create a separate issue so as not to blow up your PR here (and link them to each other).
Thanks so far!
Checklist
Describe the bug
I am attempting to update MMDeploy from version 0.10.0 to the latest version, 0.12.0. However, this breaks the Core ML conversion pipeline with an unknown error (see the stack trace section). I'm using exactly the same dependencies that I used with version 0.10.0, which worked perfectly.
I've also tested version 0.11.0 and can conclude that everything after version 0.10.0 breaks the Core ML conversion pipeline. I'm not sure exactly which commit caused this, but the breaking change is somewhere between versions 0.10.0 and 0.11.0.
Interestingly, the check_env.py script does not report Core ML as available, even though the coremltools package is installed and functional.
Reproduction
Environment
Error traceback