exadel-inc / CompreFace

Leading free and open-source face recognition system
https://exadel.com/accelerator-showcase/compreface/
Apache License 2.0

500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application #1201

Open Dvalin21 opened 10 months ago

Dvalin21 commented 10 months ago

500 Internal Server

After running CompreFace for several weeks, it just stops connecting. The admin node starts, but core and api stay at "loading".

Desktop (please complete the following information):

Pastebin with logs: https://pastebin.com/H0FvXkeX

Run those commands and attach result to the ticket:

docker ps

CONTAINER ID   IMAGE                                           COMMAND                  CREATED          STATUS          PORTS                                                                                                                                                                                            NAMES
52886e076898   exadel/compreface-core:1.2.0-arcface-r100-gpu   "/opt/nvidia/nvidia_…"   26 minutes ago   Up 59 seconds   3000/tcp                                                                                                                                                                                         compreface-core
5029f78c50c5   skrashevich/double-take:1.13.10                 "/bin/bash ./entrypo…"   6 days ago       Up 5 hours      0.0.0.0:3000->3000/tcp, :::3000->3000/tcp                                                                                                                                                        Frigate-Doubletake
f8fcbeccb0a8   ghcr.io/blakeblackshear/frigate:stable          "/init"                  6 days ago       Up 5 hours      0.0.0.0:1935->1935/tcp, :::1935->1935/tcp, 0.0.0.0:5000->5000/tcp, :::5000->5000/tcp, 0.0.0.0:8554-8555->8554-8555/tcp, :::8554-8555->8554-8555/tcp, 0.0.0.0:8555->8555/udp, :::8555->8555/udp   Frigate
fb3398b4b31b   exadel/compreface-fe:1.2.0                      "/docker-entrypoint.…"   13 days ago      Up 53 seconds   0.0.0.0:8001->80/tcp, :::8001->80/tcp                                                                                                                                                            compreface-ui
26cd65d24261   exadel/compreface-admin:1.2.0                   "sh -c 'java $ADMIN_…"   13 days ago      Up 56 seconds                                                                                                                                                                                                    compreface-admin
1f23d5bd9a3a   exadel/compreface-api:1.2.0                     "sh -c 'java $API_JA…"   13 days ago      Up 54 seconds                                                                                                                                                                                                    compreface-api
9ebbe3d57068   exadel/compreface-postgres-db:1.2.0             "docker-entrypoint.s…"   13 days ago      Up 52 seconds   5432/tcp                                                                                                                                                                                         compreface-postgres-db
7a933549b781   eclipse-mosquitto:latest                        "/docker-entrypoint.…"   13 days ago      Up 5 hours      0.0.0.0:1883->1883/tcp, :::1883->1883/tcp, 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp                                                                                                             Frigate-Mqtt
e7e908560c54   portainer/portainer-ce:latest                   "/portainer"             13 days ago      Up 5 hours      0.0.0.0:8000->8000/tcp, :::8000->8000/tcp, 0.0.0.0:9443->9443/tcp, :::9443->9443/tcp, 9000/tcp                                                                                                   portainer

docker-compose logs

tyanai commented 9 months ago

I also see this exact issue. From the compreface-core log:

compreface-core | Traceback (most recent call last):
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py", line 1903, in simple_bind
compreface-core |     check_call(_LIB.MXExecutorSimpleBindEx(self.handle,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/base.py", line 246, in check_call
compreface-core |     raise get_last_ffi_error()
compreface-core | mxnet.base.MXNetError: Traceback (most recent call last):
compreface-core |   File "/work/mxnet/src/storage/storage.cc", line 97
compreface-core | CUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected
compreface-core |
compreface-core | During handling of the above exception, another exception occurred:
compreface-core |
compreface-core | Traceback (most recent call last):
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2447, in wsgi_app
compreface-core |     response = self.full_dispatch_request()
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1945, in full_dispatch_request
compreface-core |     self.try_trigger_before_first_request_functions()
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1993, in try_trigger_before_first_request_functions
compreface-core |     func()
compreface-core |   File "/app/ml/./src/_endpoints.py", line 52, in init_model
compreface-core |     detector(
compreface-core |   File "/app/ml/./src/services/facescan/plugins/mixins.py", line 46, in __call__
compreface-core |     faces = self._fetch_faces(img, det_prob_threshold)
compreface-core |   File "/app/ml/./src/services/facescan/plugins/mixins.py", line 53, in _fetch_faces
compreface-core |     boxes = self.find_faces(img, det_prob_threshold)
compreface-core |   File "/app/ml/./src/services/facescan/plugins/insightface/insightface.py", line 103, in find_faces
compreface-core |     model = self._detection_model
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/cached_property.py", line 36, in __get__
compreface-core |     value = obj.__dict__[self.func.__name__] = self.func(obj)
compreface-core |   File "/app/ml/./src/services/facescan/plugins/insightface/insightface.py", line 80, in _detection_model
compreface-core |     model.prepare(ctx_id=self._CTX_ID, nms=self._NMS)
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/insightface/app/face_analysis.py", line 32, in prepare
compreface-core |     self.det_model.prepare(ctx_id, nms)
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/insightface/model_zoo/face_detection.py", line 217, in prepare
compreface-core |     model.bind(data_shapes=[('data', data_shape)])
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/module.py", line 422, in bind
compreface-core |     self._exec_group = DataParallelExecutorGroup(self._symbol, self._context,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py", line 280, in __init__
compreface-core |     self.bind_exec(data_shapes, label_shapes, shared_group)
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py", line 383, in bind_exec
compreface-core |     self.execs.append(self._bind_ith_exec(i, data_shapes_i, label_shapes_i,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py", line 675, in _bind_ith_exec
compreface-core |     executor = self.symbol.simple_bind(ctx=context, grad_req=self.grad_req,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py", line 1944, in simple_bind
compreface-core |     raise RuntimeError(error_msg)
compreface-core | RuntimeError: simple_bind error. Arguments:
compreface-core | data: (1, 3, 480, 640)
compreface-core | Traceback (most recent call last):
compreface-core |   File "/work/mxnet/src/storage/storage.cc", line 97
compreface-core | CUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected
compreface-core | {"severity": "WARNING", "message": "500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.", "request": {"method": "GET", "path": "/status", "filename": "", "api_key": "", "remote_addr": "172.18.0.4"}, "logger": "root", "module": "error_handling", "traceback": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py\", line 1903, in simple_bind\n check_call(_LIB.MXExecutorSimpleBindEx(self.handle,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/base.py\", line 246, in check_call\n raise get_last_ffi_error()\nmxnet.base.MXNetError: Traceback (most recent call last):\n File \"/work/mxnet/src/storage/storage.cc\", line 97\nCUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.8/dist-packages/flask/app.py\", line 2447, in wsgi_app\n response = self.full_dispatch_request()\n File \"/usr/local/lib/python3.8/dist-packages/flask/app.py\", line 1945, in full_dispatch_request\n self.try_trigger_before_first_request_functions()\n File \"/usr/local/lib/python3.8/dist-packages/flask/app.py\", line 1993, in try_trigger_before_first_request_functions\n func()\n File \"/app/ml/./src/_endpoints.py\", line 52, in init_model\n detector(\n File \"/app/ml/./src/services/facescan/plugins/mixins.py\", line 46, in __call__\n faces = self._fetch_faces(img, det_prob_threshold)\n File \"/app/ml/./src/services/facescan/plugins/mixins.py\", line 53, in _fetch_faces\n boxes = self.find_faces(img, det_prob_threshold)\n File \"/app/ml/./src/services/facescan/plugins/insightface/insightface.py\", line 103, in find_faces\n model = self._detection_model\n File \"/usr/local/lib/python3.8/dist-packages/cached_property.py\", line 36, in __get__\n value = obj.__dict__[self.func.__name__] = self.func(obj)\n File \"/app/ml/./src/services/facescan/plugins/insightface/insightface.py\", line 80, in _detection_model\n model.prepare(ctx_id=self._CTX_ID, nms=self._NMS)\n File \"/usr/local/lib/python3.8/dist-packages/insightface/app/face_analysis.py\", line 32, in prepare\n self.det_model.prepare(ctx_id, nms)\n File \"/usr/local/lib/python3.8/dist-packages/insightface/model_zoo/face_detection.py\", line 217, in prepare\n model.bind(data_shapes=[('data', data_shape)])\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/module.py\", line 422, in bind\n self._exec_group = DataParallelExecutorGroup(self._symbol, self._context,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py\", line 280, in __init__\n self.bind_exec(data_shapes, label_shapes, shared_group)\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py\", line 383, in bind_exec\n self.execs.append(self._bind_ith_exec(i, data_shapes_i, label_shapes_i,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py\", line 675, in _bind_ith_exec\n executor = self.symbol.simple_bind(ctx=context, grad_req=self.grad_req,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py\", line 1944, in simple_bind\n raise RuntimeError(error_msg)\nRuntimeError: simple_bind error. Arguments:\ndata: (1, 3, 480, 640)\nTraceback (most recent call last):\n File \"/work/mxnet/src/storage/storage.cc\", line 97\nCUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected\n", "build_version": "dev"}

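The long traceback above buries the one line that matters. As a rough sketch (not part of CompreFace), the helper below pulls the final traceback line out of a JSON log entry like the one above. The function name `root_cause` is made up for illustration, and it assumes each entry is a single JSON object with `traceback` and `message` fields, optionally preceded by the `compreface-core | ` prefix that docker-compose adds.

```python
import json


def root_cause(log_line: str) -> str:
    """Return the last non-empty traceback line of one JSON log entry.

    Hypothetical helper: assumes the entry is a JSON object with optional
    "traceback" and "message" fields, as in the compreface-core log above.
    """
    # Skip anything before the JSON object, e.g. the "compreface-core | " prefix.
    start = log_line.find("{")
    if start < 0:
        raise ValueError("no JSON object found in log line")
    entry = json.loads(log_line[start:])

    # The innermost error is normally the last non-empty line of the traceback.
    lines = [ln.strip() for ln in entry.get("traceback", "").splitlines() if ln.strip()]
    return lines[-1] if lines else entry.get("message", "")
```

Fed the WARNING entry above, this would surface the `CUDA: Check failed ... no CUDA-capable device is detected` line, which suggests the container can no longer see the GPU, rather than a fault in the Flask application itself.
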
tyanai commented 9 months ago

The core-api log shows this exception:

com.exadel.frs.commonservice.sdk.faces.exception.FacesServiceException: Error during synchronization between servers: [500 INTERNAL SERVER ERROR] during [GET] to [http://compreface-core:3000/status] [FacesFeignClient#getStatus()]: [{"message":"500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."}
compreface-api | ]
compreface-api |     at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient.getStatus(FacesRestApiClient.java:101)
compreface-api |     at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient$$FastClassBySpringCGLIB$$517e8caf.invoke(<generated>)
compreface-api |     at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
compreface-api |     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
compreface-api |     at org.springframework.cache.interceptor.CacheInterceptor.lambda$invoke$0(CacheInterceptor.java:54)
compreface-api |     at org.springframework.cache.interceptor.CacheAspectSupport.invokeOperation(CacheAspectSupport.java:366)
compreface-api |     at org.springframework.cache.interceptor.CacheAspectSupport.execute(CacheAspectSupport.java:421)
compreface-api |     at org.springframework.cache.interceptor.CacheAspectSupport.execute(CacheAspectSupport.java:345)
compreface-api |     at org.springframework.cache.interceptor.CacheInterceptor.invoke(CacheInterceptor.java:64)
compreface-api |     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:708)
compreface-api |     at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient$$EnhancerBySpringCGLIB$$cb805c15.getStatus(<generated>)
compreface-api |     at com.exadel.frs.core.trainservice.controller.ConsistenceController.getCheckDemo(ConsistenceController.java:25)
compreface-api |     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
compreface-api |     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
compreface-api |     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
compreface-api |     at java.base/java.lang.reflect.Method.invoke(Unknown Source)
compreface-api |     at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
compreface-api |     at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150)
compreface-api |     at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:117)
compreface-api |     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
compreface-api |     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
compreface-api |     at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
compreface-api |     at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1067)
compreface-api |     at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:963)
compreface-api |     at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
compreface-api |     at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
compreface-api |     at javax.servlet.http.HttpServlet.service(HttpServlet.java:655)
compreface-api |     at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
compreface-api |     at javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at com.exadel.frs.core.trainservice.filter.SecurityValidationFilter.doFilter(SecurityValidationFilter.java:134)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:96)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)
compreface-api |     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
compreface-api |     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
compreface-api |     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
compreface-api |     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
compreface-api |     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
compreface-api |     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:360)
compreface-api |     at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:399)
compreface-api |     at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
compreface-api |     at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:890)
compreface-api |     at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1743)
compreface-api |     at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
compreface-api |     at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
compreface-api |     at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
compreface-api |     at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
compreface-api |     at java.base/java.lang.Thread.run(Unknown Source)
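
The exception above only says that the API's `GET` to `http://compreface-core:3000/status` came back with a 500, so the API container is probably just relaying a failure in core. A minimal sketch for probing that endpoint directly is below. The function name `probe_status` is made up for illustration, and it assumes the script runs somewhere the `compreface-core` hostname resolves (for example, inside another container on the same Docker network).

```python
import urllib.error
import urllib.request


def probe_status(url, opener=urllib.request.urlopen, timeout=5.0):
    """GET `url` and return (http_status, body_text).

    Hypothetical helper. http_status is None when the host cannot be
    reached at all; `opener` is injectable so the function can be
    exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # The server answered with an error status such as 500; keep its body.
        return err.code, err.read().decode("utf-8", "replace")
    except urllib.error.URLError as err:
        # DNS failure, connection refused, timeout and similar.
        return None, str(err.reason)
```

Calling `probe_status("http://compreface-core:3000/status")` should distinguish three cases: a 200 means core is healthy, a 500 with the same JSON message points at core's model initialisation, and `None` means the container is not reachable at all.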