exadel-inc / CompreFace

Leading free and open-source face recognition system
https://exadel.com/accelerator-showcase/compreface/
Apache License 2.0
5.67k stars 773 forks source link

Version 1.0.0-arcface-r100: cudaMalloc failed: out of memory #783

Closed koldogut closed 2 years ago

koldogut commented 2 years ago

Describe the bug Nvidia GPU based docker image prompts an error on startup, webui works but detection doesn't. GPU is an NvidiaT400 with 2GB of RAM

To Reproduce Steps to reproduce the behavior:

  1. Install and configure docker image
  2. Add --runtime=nvidia as extra parameters
  3. Start container
  4. See error on logs

Expected behavior No CUDA error

Screenshots nvidia-smi command returns this image

the error found in the logs says it's out of memory but should have enough, Compreface is the only process using GPU resources

`{"severity": "CRITICAL", "message": "PluginError: insightface.Calculator@arcface-r100-msfdrop75 error - simple_bind error. Arguments:\ndata: (1, 3, 112, 112)\n[00:24:37] src/storage/./pooled_storage_manager.h:157: cudaMalloc failed: out of memory\nStack trace:\n [bt] (0) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x4b04cb) [0x14b797b704cb]\n [bt] (1) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e653eb) [0x14b79a5253eb]\n [bt] (2) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e6af0f) [0x14b79a52af0f]\n [bt] (3) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::NDArray::NDArray(mxnet::TShape const&, mxnet::Context, bool, int)+0x5d0) [0x14b799bf16d0]\n [bt] (4) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::InitZeros(mxnet::NDArrayStorageType, mxnet::TShape const&, mxnet::Context const&, int)+0x5c) [0x14b799ca58ac]\n [bt] (5) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::ReshapeOrCreate(std::string const&, mxnet::TShape const&, int, mxnet::NDArrayStorageType, mxnet::Context const&, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, bool)+0x3a1) [0x14b799cb8f91]\n [bt] (6) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::InitArguments(nnvm::IndexedGraph const&, std::vector<mxnet::TShape, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, mxnet::Executor const, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >)+0xb10) [0x14b799cc0f60]\n [bt] (7) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::Init(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor, std::unordered_map<nnvm::NodeEntry, mxnet::NDArray, nnvm::NodeEntryHash, nnvm::NodeEntryEqual, std::allocator<std::pair<nnvm::NodeEntry const, mxnet::NDArray> > > const&)+0x6bc) [0x14b799ccf32c]\n [bt] (8) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::Executor::SimpleBind(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor)+0x8a8) [0x14b799cd0258]\n\n", "request": {"method": "POST", "path": "/find_faces", "filename": "lenna.jpg", "api_key": "", "remoteaddr": "127.0.0.1"}, "logger": "src.services.flask.error_handling", "module": "error_handling", "traceback": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.7/dist-packages/mxnet/symbol/symbol.py\", line 1623, in simple_bind\n ctypes.byref(exe_handle)))\n File \"/usr/local/lib/python3.7/dist-packages/mxnet/base.py\", line 253, in check_call\n raise MXNetError(py_str(_LIB.MXGetLastError()))\nmxnet.base.MXNetError: [00:24:37] src/storage/./pooled_storage_manager.h:157: cudaMalloc failed: out of memory\nStack trace:\n [bt] (0) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x4b04cb) [0x14b797b704cb]\n [bt] (1) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e653eb) [0x14b79a5253eb]\n [bt] (2) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e6af0f) [0x14b79a52af0f]\n [bt] (3) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::NDArray::NDArray(mxnet::TShape const&, mxnet::Context, bool, int)+0x5d0) [0x14b799bf16d0]\n [bt] (4) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::InitZeros(mxnet::NDArrayStorageType, mxnet::TShape const&, mxnet::Context const&, int)+0x5c) [0x14b799ca58ac]\n [bt] (5) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::ReshapeOrCreate(std::string const&, mxnet::TShape const&, int, mxnet::NDArrayStorageType, mxnet::Context const&, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, bool)+0x3a1) [0x14b799cb8f91]\n [bt] (6) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::InitArguments(nnvm::IndexedGraph const&, std::vector<mxnet::TShape, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, mxnet::Executor const, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >)+0xb10) [0x14b799cc0f60]\n [bt] (7) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::Init(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor, std::unordered_map<nnvm::NodeEntry, mxnet::NDArray, nnvm::NodeEntryHash, nnvm::NodeEntryEqual, std::allocator<std::pair<nnvm::NodeEntry const, mxnet::NDArray> > > const&)+0x6bc) [0x14b799ccf32c]\n [bt] (8) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::Executor::SimpleBind(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor)+0x8a8) [0x14b799cd0258]\n\n\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"./src/services/facescan/plugins/mixins.py\", line 64, in _apply_face_plugins\n result_dto = plugin(face)\n File \"./src/services/facescan/plugins/mixins.py\", line 91, in call\n embedding=self.calc_embedding(face._face_img)\n File \"./src/services/facescan/plugins/insightface/insightface.py\", line 119, in calc_embedding\n return self._calculation_model.get_embedding(face_img).flatten()\n File \"/usr/local/lib/python3.7/dist-packages/cached_property.py\", line 36, in get\n value = obj.dict[self.func.name] = self.func(obj)\n File \"./src/services/facescan/plugins/insightface/insightface.py\", line 126, in _calculation_model\n model.prepare(ctx_id=self._CTX_ID)\n File \"/usr/local/lib/python3.7/dist-packages/insightface/model_zoo/face_recognition.py\", line 35, in prepare\n model.bind(data_shapes=[('data', data_shape)])\n File \"/usr/local/lib/python3.7/dist-packages/mxnet/module/module.py\", line 429, in bind\n state_names=self._state_names)\n File \"/usr/local/lib/python3.7/dist-packages/mxnet/module/executor_group.py\", line 279, in init\n self.bind_exec(data_shapes, label_shapes, shared_group)\n File \"/usr/local/lib/python3.7/dist-packages/mxnet/module/executor_group.py\", line 375, in bind_exec\n shared_group))\n File \"/usr/local/lib/python3.7/dist-packages/mxnet/module/executor_group.py\", line 662, in _bind_ith_exec\n shared_buffer=shared_data_arrays, input_shapes)\n File \"/usr/local/lib/python3.7/dist-packages/mxnet/symbol/symbol.py\", line 1629, in simple_bind\n raise RuntimeError(error_msg)\nRuntimeError: simple_bind error. Arguments:\ndata: (1, 3, 112, 112)\n[00:24:37] src/storage/./pooled_storage_manager.h:157: cudaMalloc failed: out of memory\nStack trace:\n [bt] (0) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x4b04cb) [0x14b797b704cb]\n [bt] (1) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e653eb) [0x14b79a5253eb]\n [bt] (2) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e6af0f) [0x14b79a52af0f]\n [bt] (3) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::NDArray::NDArray(mxnet::TShape const&, mxnet::Context, bool, int)+0x5d0) [0x14b799bf16d0]\n [bt] (4) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::InitZeros(mxnet::NDArrayStorageType, mxnet::TShape const&, mxnet::Context const&, int)+0x5c) [0x14b799ca58ac]\n [bt] (5) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::ReshapeOrCreate(std::string const&, mxnet::TShape const&, int, mxnet::NDArrayStorageType, mxnet::Context const&, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, bool)+0x3a1) [0x14b799cb8f91]\n [bt] (6) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::InitArguments(nnvm::IndexedGraph const&, std::vector<mxnet::TShape, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, mxnet::Executor const, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >)+0xb10) [0x14b799cc0f60]\n [bt] (7) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::Init(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor, std::unordered_map<nnvm::NodeEntry, mxnet::NDArray, nnvm::NodeEntryHash, nnvm::NodeEntryEqual, std::allocator<std::pair<nnvm::NodeEntry const, mxnet::NDArray> > > const&)+0x6bc) [0x14b799ccf32c]\n [bt] (8) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::Executor::SimpleBind(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor)+0x8a8) [0x14b799cd0258]\n\n\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.7/dist-packages/flask/app.py\", line 1950, in full_dispatch_request\n rv = self.dispatch_request()\n File \"/usr/local/lib/python3.7/dist-packages/flask/app.py\", line 1936, in dispatch_request\n return self.view_functions[rule.endpoint](req.viewargs)\n File \"./src/services/flask/needs_attached_file.py\", line 32, in wrapper\n return f(*args, *kwargs)\n File \"./src/_endpoints.py\", line 72, in find_faces_post\n face_plugins=face_plugins\n File \"./src/services/facescan/plugins/mixins.py\", line 46, in call\n self._apply_face_plugins(face, face_plugins)\n File \"./src/services/facescan/plugins/mixins.py\", line 67, in _apply_face_plugins\n raise exceptions.PluginError(f'{plugin} error - {e}')\nsrc.services.facescan.plugins.exceptions.PluginError: insightface.Calculator@arcface-r100-msfdrop75 error - simple_bind error. Arguments:\ndata: (1, 3, 112, 112)\n[00:24:37] src/storage/./pooled_storage_manager.h:157: cudaMalloc failed: out of memory\nStack trace:\n [bt] (0) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x4b04cb) [0x14b797b704cb]\n [bt] (1) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e653eb) [0x14b79a5253eb]\n [bt] (2) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(+0x2e6af0f) [0x14b79a52af0f]\n [bt] (3) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::NDArray::NDArray(mxnet::TShape const&, mxnet::Context, bool, int)+0x5d0) [0x14b799bf16d0]\n [bt] (4) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::InitZeros(mxnet::NDArrayStorageType, mxnet::TShape const&, mxnet::Context const&, int)+0x5c) [0x14b799ca58ac]\n [bt] (5) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::common::ReshapeOrCreate(std::string const&, mxnet::TShape const&, int, mxnet::NDArrayStorageType, mxnet::Context const&, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, bool)+0x3a1) [0x14b799cb8f91]\n [bt] (6) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::InitArguments(nnvm::IndexedGraph const&, std::vector<mxnet::TShape, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<int, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, mxnet::Executor const, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >)+0xb10) [0x14b799cc0f60]\n [bt] (7) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::Init(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor, std::unordered_map<nnvm::NodeEntry, mxnet::NDArray, nnvm::NodeEntryHash, nnvm::NodeEntryEqual, std::allocator<std::pair<nnvm::NodeEntry const, mxnet::NDArray> > > const&)+0x6bc) [0x14b799ccf32c]\n [bt] (8) /usr/local/lib/python3.7/dist-packages/mxnet/libmxnet.so(mxnet::Executor::SimpleBind(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor*)+0x8a8) [0x14b799cd0258]\n\n\n", "build_version": "dev"} 2022-06-04 00:24:37.174 ERROR 187 --- [nio-8080-exec-3] c.e.f.c.h.ResponseExceptionHandler : Defined exception occurred com.exadel.frs.commonservice.sdk.faces.exception.FacesServiceException: Error during synchronization between servers: [500 INTERNAL SERVER ERROR] during [POST] to [http://localhost:3000/find_faces] [FacesFeignClient#findFaces(MultipartFile,Integer,Double,String)]: [{"message":"PluginError: insightface.Calculator@arcface-r100-msfdrop75 error - simple_bind error. Arguments:\ndata: (1, 3, 112, 112)\n[00:20:37] src/storage/./pooled_storage_manager.h:157: cudaMalloc ... (5436 bytes)] at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient.findWithCalculator(FacesRestApiClient.java:91) at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient.findFacesWithCalculator(FacesRestApiClient.java:57) at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient$$FastClassBySpringCGLIB$$517e8caf.invoke() at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687) at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient$$EnhancerBySpringCGLIB$$45300267.findFacesWithCalculator() at com.exadel.frs.core.trainservice.service.FaceRecognizeProcessServiceImpl.processImage(FaceRecognizeProcessServiceImpl.java:49) at com.exadel.frs.core.trainservice.service.FaceRecognizeProcessServiceImpl.processImage(FaceRecognizeProcessServiceImpl.java:26) at com.exadel.frs.core.trainservice.controller.RecognizeController.recognize(RecognizeController.java:75) at com.exadel.frs.core.trainservice.controller.RecognizeController$$FastClassBySpringCGLIB$$52b4c4f5.invoke() at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:771) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:749) at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:119) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:749) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:691) at com.exadel.frs.core.trainservice.controller.RecognizeController$$EnhancerBySpringCGLIB$$673f2f8f.recognize() at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:105) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:878) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:792) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006) at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:909) at javax.servlet.http.HttpServlet.service(HttpServlet.java:652) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883) at javax.servlet.http.HttpServlet.service(HttpServlet.java:733) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at com.exadel.frs.core.trainservice.filter.SecurityValidationFilter.doFilter(SecurityValidationFilter.java:124) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343) at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374) at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65) at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1590) at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.base/java.lang.Thread.run(Thread.java:829)

172.17.0.1 - - [04/Jun/2022:00:20:37 +0200] "POST /api/v1/recognition/recognize?face_plugins=undefined&det_prob_threshold=0.8 HTTP/1.1" 500 463 "-" "axios/0.27.2"`

Docker:

koldogut commented 2 years ago

Version 0.6.1 has the same behavior while MobileNet works fine and is taking just 780Mb of RAM in ver 1.0.0, ¿Is not 2GB enough? Arc-100 seems to be consuming 1784Mb of 2048 in total.

pospielov commented 2 years ago

if MobileNet works fine, then probably 2Gb memory is not enough. Those two builds are identical, except for the model.

koldogut commented 2 years ago

True, moved to a 4Gb Nvidia card and works fine.