open-mmlab / mmcv

OpenMMLab Computer Vision Foundation
https://mmcv.readthedocs.io/en/latest/
Apache License 2.0

sagemaker serverless inference with mmdet #1925

Open lukqw opened 2 years ago

lukqw commented 2 years ago

Hi there, I am trying to deploy my mmdetection model on AWS SageMaker Serverless Inference and am running into the following issues.

The deep learning container I am using has torch 1.10.2 and only offers a CPU.

Therefore I tried to install mmdet and mmcv via a requirements.txt, which resulted in the following error:

2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
-- | --
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "/opt/ml/model/code/inference.py", line 13, in <module>
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - from mmdet.apis import inference_detector, init_detector
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmdet/apis/__init__.py", line 2, in <module>
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - from .inference import (async_inference_detector, inference_detector,
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmdet/apis/inference.py", line 8, in <module>
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - from mmcv.ops import RoIPool
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 2, in <module>
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - from .active_rotated_filter import active_rotated_filter
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/ops/active_rotated_filter.py", line 8, in <module>
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - ext_module = ext_loader.load_ext(
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - ext = importlib.import_module('mmcv.' + name)
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - return _bootstrap._gcd_import(name[level:], package, level)
  | 2022-04-28T14:53:18.706+02:00 | 2022-04-28T12:53:18,705 [INFO ] W-9003-model_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named 'mmcv._ext'

Because of that I tried to install mmcv-full from the following mirror:

https://download.openmmlab.com/mmcv/dist/cpu/torch1.10.0/index.html

Then I am getting the following error:

2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
-- | --
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/ml/model/code/inference.py", line 13, in <module>
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from mmdet.apis import inference_detector, init_detector
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmdet/apis/__init__.py", line 2, in <module>
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from .inference import (async_inference_detector, inference_detector,
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmdet/apis/inference.py", line 8, in <module>
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from mmcv.ops import RoIPool
  | 2022-04-28T12:54:34.088+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 2, in <module>
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from .active_rotated_filter import active_rotated_filter
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] epollEventLoopGroup-5-4 org.pytorch.serve.wlm.WorkerThread - 9002 Worker disconnected. WORKER_STARTED
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [WARN ] W-9002-model_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: model, error: Worker died.
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/ops/active_rotated_filter.py", line 8, in <module>
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - ext_module = ext_loader.load_ext(
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - ext = importlib.import_module('mmcv.' + name)
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - return _bootstrap._gcd_import(name[level:], package, level)
  | 2022-04-28T12:54:34.089+02:00 | 2022-04-28T10:54:34,088 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - ImportError: /home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs

In essence, the error is the following:

ImportError: /home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs

Does anyone know how to fix it?
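
A note for readers hitting the same error: an undefined symbol raised while loading mmcv/_ext...so generally points to the compiled extension and the installed torch coming from different builds, so it is worth confirming that the wheel index in requirements.txt matches the container's torch. The sketch below derives an index URL from a torch version string. It assumes that every index follows the shape of the one URL quoted above and that wheels are published for patch release 0 of each torch minor version; mmcv_find_links_url is a hypothetical helper, not part of mmcv or mim.

```python
import re


def mmcv_find_links_url(torch_version: str, device: str = "cpu") -> str:
    """Build a wheel-index URL shaped like the one quoted in this thread.

    Assumption: wheels are published per torch minor release with the patch
    number set to 0, so torch 1.10.2 maps to the "torch1.10.0" index.
    """
    # Accept strings such as "1.10.2" or "1.10.2+cpu"; keep major and minor.
    match = re.match(r"^(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognised torch version: {torch_version!r}")
    major, minor = match.groups()
    return (
        "https://download.openmmlab.com/mmcv/dist/"
        f"{device}/torch{major}.{minor}.0/index.html"
    )


if __name__ == "__main__":
    # In the container one would pass torch.__version__ instead of a literal.
    print(mmcv_find_links_url("1.10.2"))
```

By this rule the index used in this thread (torch1.10.0) already matches the container's torch 1.10.2, so a mismatched index alone may not explain the failure reported here.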

zhouzaida commented 2 years ago

Hi, what command did you use to install mmcv-full?

lukqw commented 2 years ago

I have the deps in a requirements.txt, which gets installed automatically by SageMaker:


--find-links https://download.openmmlab.com/mmcv/dist/cpu/torch1.10.0/index.html
mmcv-full

mmdet==2.24

Installation logs:

2022-04-28T15:34:01.087+02:00 | Looking in links: https://download.openmmlab.com/mmcv/dist/cpu/torch1.10.0/index.html
-- | --
  | 2022-04-28T15:34:01.851+02:00 | Collecting mmcv-full
  | 2022-04-28T15:34:02.169+02:00 | Downloading https://download.openmmlab.com/mmcv/dist/cpu/torch1.10.0/mmcv_full-1.5.0-cp38-cp38-manylinux1_x86_64.whl (30.9 MB)
  | 2022-04-28T15:34:04.876+02:00 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 30.9/30.9 MB 10.5 MB/s eta 0:00:00
  | 2022-04-28T15:34:05.186+02:00 | Collecting mmdet==2.24
  | 2022-04-28T15:34:05.211+02:00 | Downloading mmdet-2.24.0-py3-none-any.whl (1.4 MB)
  | 2022-04-28T15:34:05.232+02:00 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 75.3 MB/s eta 0:00:00

zhouzaida commented 2 years ago

Could you provide the output of pip list | grep mmcv? You probably installed both mmcv and mmcv-full. If so, uninstall both and install mmcv-full again.

lukqw commented 2 years ago

Sadly I cannot provide the output of that command. I am working with a SageMaker instance where I am not able to execute such commands, AFAIK. The packages in requirements.txt are installed fresh on each deployment, so you can treat the logs below as coming from a new virtual environment, which rules out mmcv being already installed.

I dug deeper and can provide the full installation logs:

2022-04-28T15:34:00.842+02:00 | Defaulting to user installation because normal site-packages is not writeable
-- | --
  | 2022-04-28T15:34:01.087+02:00 | Looking in links: https://download.openmmlab.com/mmcv/dist/cpu/torch1.10.0/index.html
  | 2022-04-28T15:34:01.851+02:00 | Collecting mmcv-full
  | 2022-04-28T15:34:02.169+02:00 | Downloading https://download.openmmlab.com/mmcv/dist/cpu/torch1.10.0/mmcv_full-1.5.0-cp38-cp38-manylinux1_x86_64.whl (30.9 MB)
  | 2022-04-28T15:34:04.876+02:00 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 30.9/30.9 MB 10.5 MB/s eta 0:00:00
  | 2022-04-28T15:34:05.186+02:00 | Collecting mmdet==2.24
  | 2022-04-28T15:34:05.211+02:00 | Downloading mmdet-2.24.0-py3-none-any.whl (1.4 MB)
  | 2022-04-28T15:34:05.232+02:00 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 75.3 MB/s eta 0:00:00
  | 2022-04-28T15:34:05.288+02:00 | Requirement already satisfied: matplotlib in /opt/conda/lib/python3.8/site-packages (from mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (3.5.1)
  | 2022-04-28T15:34:05.288+02:00 | Requirement already satisfied: six in /opt/conda/lib/python3.8/site-packages (from mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (1.16.0)
  | 2022-04-28T15:34:05.546+02:00 | Collecting terminaltables
  | 2022-04-28T15:34:05.551+02:00 | Downloading terminaltables-3.1.10-py2.py3-none-any.whl (15 kB)
  | 2022-04-28T15:34:05.555+02:00 | Requirement already satisfied: numpy in /opt/conda/lib/python3.8/site-packages (from mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (1.22.2)
  | 2022-04-28T15:34:05.820+02:00 | Collecting pycocotools
  | 2022-04-28T15:34:05.827+02:00 | Downloading pycocotools-2.0.4.tar.gz (106 kB)
  | 2022-04-28T15:34:05.832+02:00 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 106.6/106.6 KB 47.8 MB/s eta 0:00:00
  | 2022-04-28T15:34:06.050+02:00 | Installing build dependencies: started
  | 2022-04-28T15:34:15.083+02:00 | Installing build dependencies: finished with status 'done'
  | 2022-04-28T15:34:15.087+02:00 | Getting requirements to build wheel: started
  | 2022-04-28T15:34:16.652+02:00 | Getting requirements to build wheel: finished with status 'done'
  | 2022-04-28T15:34:16.656+02:00 | Preparing metadata (pyproject.toml): started
  | 2022-04-28T15:34:18.391+02:00 | Preparing metadata (pyproject.toml): finished with status 'done'
  | 2022-04-28T15:34:18.428+02:00 | Requirement already satisfied: Pillow in /opt/conda/lib/python3.8/site-packages (from mmcv-full->-r /opt/ml/model/code/requirements.txt (line 8)) (9.1.0)
  | 2022-04-28T15:34:18.429+02:00 | Requirement already satisfied: packaging in /opt/conda/lib/python3.8/site-packages (from mmcv-full->-r /opt/ml/model/code/requirements.txt (line 8)) (20.4)
  | 2022-04-28T15:34:18.943+02:00 | Collecting opencv-python>=3
  | 2022-04-28T15:34:18.948+02:00 | Downloading opencv_python-4.5.5.64-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (60.5 MB)
  | 2022-04-28T15:34:19.548+02:00 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.5/60.5 MB 34.3 MB/s eta 0:00:00
  | 2022-04-28T15:34:20.027+02:00 | Collecting yapf
  | 2022-04-28T15:34:20.034+02:00 | Downloading yapf-0.32.0-py2.py3-none-any.whl (190 kB)
  | 2022-04-28T15:34:20.039+02:00 | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 190.2/190.2 KB 44.2 MB/s eta 0:00:00
  | 2022-04-28T15:34:20.293+02:00 | Collecting addict
  | 2022-04-28T15:34:20.298+02:00 | Downloading addict-2.4.0-py3-none-any.whl (3.8 kB)
  | 2022-04-28T15:34:20.301+02:00 | Requirement already satisfied: pyyaml in /opt/conda/lib/python3.8/site-packages (from mmcv-full->-r /opt/ml/model/code/requirements.txt (line 8)) (5.4.1)
  | 2022-04-28T15:34:20.324+02:00 | Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.8/site-packages (from matplotlib->mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (0.11.0)
  | 2022-04-28T15:34:20.325+02:00 | Requirement already satisfied: python-dateutil>=2.7 in /opt/conda/lib/python3.8/site-packages (from matplotlib->mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (2.8.2)
  | 2022-04-28T15:34:20.325+02:00 | Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.8/site-packages (from matplotlib->mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (4.31.2)
  | 2022-04-28T15:34:20.326+02:00 | Requirement already satisfied: pyparsing>=2.2.1 in /opt/conda/lib/python3.8/site-packages (from matplotlib->mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (3.0.7)
  | 2022-04-28T15:34:20.327+02:00 | Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.8/site-packages (from matplotlib->mmdet==2.24->-r /opt/ml/model/code/requirements.txt (line 10)) (1.4.2)
  | 2022-04-28T15:34:20.431+02:00 | Building wheels for collected packages: pycocotools
  | 2022-04-28T15:34:20.432+02:00 | Building wheel for pycocotools (pyproject.toml): started
  | 2022-04-28T15:34:28.218+02:00 | Building wheel for pycocotools (pyproject.toml): finished with status 'done'
  | 2022-04-28T15:34:28.220+02:00 | Created wheel for pycocotools: filename=pycocotools-2.0.4-cp38-cp38-linux_x86_64.whl size=422909 sha256=555464b8c9ecb5a9c9e703ca88a5e8c0becd9b70c9673e00e2d8314be38cdf36
  | 2022-04-28T15:34:28.221+02:00 | Stored in directory: /home/sbx_user1051/.cache/pip/wheels/dd/e2/43/3e93cd653b3346b3d702bb0509bc611189f95d60407bff1484
  | 2022-04-28T15:34:28.223+02:00 | Successfully built pycocotools
  | 2022-04-28T15:34:28.615+02:00 | Installing collected packages: yapf, addict, terminaltables, opencv-python, mmcv-full, pycocotools, mmdet

What I can infer from the last line is that only mmcv-full is installed.
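
Since no shell is available on the endpoint, the pip list | grep mmcv check requested earlier could be approximated from Python, for example at the top of inference.py so the result lands in the same logs. This is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8); find_installed is a hypothetical helper name.

```python
from importlib import metadata  # standard library since Python 3.8


def find_installed(fragment: str):
    """Return sorted (name, version) pairs whose name contains `fragment`."""
    needle = fragment.lower()
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        # Broken installs can lack a Name field; skip those entries.
        if name and needle in name.lower():
            found.add((name, dist.version))
    return sorted(found)


if __name__ == "__main__":
    # Rough equivalent of: pip list | grep mmcv
    for name, version in find_installed("mmcv"):
        print(f"{name}=={version}")
```

If both mmcv and mmcv-full show up in the output, the two packages are installed side by side, which is the situation zhouzaida suspected.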

Here is more of the error log:

2022-04-28T15:36:25.305+02:00 | 2022-04-28T13:36:25,305 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - Backend worker process died.
-- | --
  | 2022-04-28T15:36:25.305+02:00 | 2022-04-28T13:36:25,305 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - Traceback (most recent call last):
  | 2022-04-28T15:36:25.305+02:00 | 2022-04-28T13:36:25,305 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 189, in <module>
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,305 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - worker.run_server()
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 161, in run_server
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - self.handle_connection(cl_socket)
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 123, in handle_connection
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - service, result, code = self.load_model(msg)
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 95, in load_model
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - service = model_loader.load(model_name, model_dir, handler, gpu,
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/ts/model_loader.py", line 112, in load
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - initialize_fn(service.context)
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/sagemaker_pytorch_serving_container/handler_service.py", line 51, in initialize
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - super().initialize(context)
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/sagemaker_inference/default_handler_service.py", line 66, in initialize
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - self._service.validate_and_initialize(model_dir=model_dir)
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/sagemaker_inference/transformer.py", line 157, in validate_and_initialize
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - self._validate_user_module_and_set_functions()
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/site-packages/sagemaker_inference/transformer.py", line 170, in _validate_user_module_and_set_functions
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - user_module = importlib.import_module(user_module_name)
  | 2022-04-28T15:36:25.306+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,306 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - return _bootstrap._gcd_import(name[level:], package, level)
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/ml/model/code/inference.py", line 13, in <module>
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from mmdet.apis import inference_detector, init_detector
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmdet/apis/__init__.py", line 2, in <module>
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from .inference import (async_inference_detector, inference_detector,
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmdet/apis/inference.py", line 8, in <module>
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from mmcv.ops import RoIPool
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 2, in <module>
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - from .active_rotated_filter import active_rotated_filter
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/ops/active_rotated_filter.py", line 8, in <module>
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - ext_module = ext_loader.load_ext(
  | 2022-04-28T15:36:25.307+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
  | 2022-04-28T15:36:25.308+02:00 | 2022-04-28T13:36:25,307 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - ext = importlib.import_module('mmcv.' + name)
  | 2022-04-28T15:36:25.308+02:00 | 2022-04-28T13:36:25,308 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - File "/opt/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
  | 2022-04-28T15:36:25.308+02:00 | 2022-04-28T13:36:25,308 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - return _bootstrap._gcd_import(name[level:], package, level)
  | 2022-04-28T15:36:25.308+02:00 | 2022-04-28T13:36:25,308 [INFO ] W-9002-model_1.0-stdout MODEL_LOG - ImportError: /home/sbx_user1051/.local/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs

payal211 commented 1 year ago

Hi @lukqw, @zhouzaida

Can you please provide the steps to deploy mmdetection on SageMaker?

Thanks.

lukqw commented 1 year ago

I never managed to get it to work, @payal211. It seemed to me like some dependency was missing in the Docker container.

payal211 commented 1 year ago

Thanks for the update, @lukqw. Have you tested a pretrained model or written an inference script for SageMaker?

zhouzaida commented 1 year ago

Hi @payal211 and @lukqw, sorry for my late reply. Unfortunately, we don't have this environment currently, so we can't provide the steps. If you encounter any problems, you are welcome to provide error messages here.

heyitsguay commented 1 year ago

Well, for what it's worth, I would also like to be able to run this in SageMaker but am having difficulty getting a working installation. If anyone figures it out, I would love to hear how to do it!

yossibiton commented 1 year ago

I got the same error on my Ubuntu machine: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs

It occurs when I try to run this import: from mmcv import ops

I installed mmcv with mim:

pip install -U openmim
mim install mmcv-full
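
For anyone reproducing this, it may help to capture the import failure together with the torch build details. The missing symbol demangles to the constructor c10::Error::Error(c10::SourceLocation, std::string), and the trailing Ss is the mangling of std::string under the pre-C++11 libstdc++ ABI, so one thing worth comparing is whether the installed torch was built with the C++11 ABI. The sketch below is a diagnostic only, and the thread does not confirm that an ABI mismatch is the cause; try_import and torch_build_info are hypothetical helper names.

```python
import importlib


def try_import(module_name: str) -> dict:
    """Import a module and report the outcome instead of raising."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:  # also covers ModuleNotFoundError
        return {"module": module_name, "ok": False, "error": str(exc)}
    return {"module": module_name, "ok": True, "error": None}


def torch_build_info() -> dict:
    """Collect torch build details that matter for compiled extensions."""
    try:
        import torch
    except ImportError:
        return {"torch": None}
    return {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,  # None on CPU-only builds
        "cxx11_abi": torch.compiled_with_cxx11_abi(),
    }


if __name__ == "__main__":
    print(torch_build_info())
    print(try_import("mmcv._ext"))
```

Posting both dictionaries alongside the traceback gives maintainers the torch version and ABI flag without needing shell access to the machine.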

payal211 commented 1 year ago

Hi @zhouzaida, I managed to deploy it using a Docker container and TorchServe. @lukqw, if you haven't succeeded yet, please look into the Dockerfile. It is still a bit tricky to deploy on SageMaker, though.