exadel-inc / CompreFace

Leading free and open-source face recognition system
https://exadel.com/accelerator-showcase/compreface/
Apache License 2.0
5.66k stars 772 forks

500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application #1201

Open Dvalin21 opened 1 year ago

Dvalin21 commented 1 year ago

500 Internal Server

After running CompreFace for several weeks, it just stops connecting. The admin node starts, but core and api stay at "loading".

Pastebin with logs: https://pastebin.com/H0FvXkeX

Run these commands and attach the results to the ticket:

docker ps

CONTAINER ID   IMAGE                                           COMMAND                  CREATED          STATUS          PORTS                                                                                                                                                                                            NAMES
52886e076898   exadel/compreface-core:1.2.0-arcface-r100-gpu   "/opt/nvidia/nvidia_…"   26 minutes ago   Up 59 seconds   3000/tcp                                                                                                                                                                                         compreface-core
5029f78c50c5   skrashevich/double-take:1.13.10                 "/bin/bash ./entrypo…"   6 days ago       Up 5 hours      0.0.0.0:3000->3000/tcp, :::3000->3000/tcp                                                                                                                                                        Frigate-Doubletake
f8fcbeccb0a8   ghcr.io/blakeblackshear/frigate:stable          "/init"                  6 days ago       Up 5 hours      0.0.0.0:1935->1935/tcp, :::1935->1935/tcp, 0.0.0.0:5000->5000/tcp, :::5000->5000/tcp, 0.0.0.0:8554-8555->8554-8555/tcp, :::8554-8555->8554-8555/tcp, 0.0.0.0:8555->8555/udp, :::8555->8555/udp   Frigate
fb3398b4b31b   exadel/compreface-fe:1.2.0                      "/docker-entrypoint.…"   13 days ago      Up 53 seconds   0.0.0.0:8001->80/tcp, :::8001->80/tcp                                                                                                                                                            compreface-ui
26cd65d24261   exadel/compreface-admin:1.2.0                   "sh -c 'java $ADMIN_…"   13 days ago      Up 56 seconds                                                                                                                                                                                                    compreface-admin
1f23d5bd9a3a   exadel/compreface-api:1.2.0                     "sh -c 'java $API_JA…"   13 days ago      Up 54 seconds                                                                                                                                                                                                    compreface-api
9ebbe3d57068   exadel/compreface-postgres-db:1.2.0             "docker-entrypoint.s…"   13 days ago      Up 52 seconds   5432/tcp                                                                                                                                                                                         compreface-postgres-db
7a933549b781   eclipse-mosquitto:latest                        "/docker-entrypoint.…"   13 days ago      Up 5 hours      0.0.0.0:1883->1883/tcp, :::1883->1883/tcp, 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp                                                                                                             Frigate-Mqtt
e7e908560c54   portainer/portainer-ce:latest                   "/portainer"             13 days ago      Up 5 hours      0.0.0.0:8000->8000/tcp, :::8000->8000/tcp, 0.0.0.0:9443->9443/tcp, :::9443->9443/tcp, 9000/tcp                                                                                                   portainer

docker-compose logs

tyanai commented 11 months ago

I also see this exact issue. From the compreface-core log:

compreface-core | Traceback (most recent call last):
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py", line 1903, in simple_bind
compreface-core |     check_call(_LIB.MXExecutorSimpleBindEx(self.handle,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/base.py", line 246, in check_call
compreface-core |     raise get_last_ffi_error()
compreface-core | mxnet.base.MXNetError: Traceback (most recent call last):
compreface-core |   File "/work/mxnet/src/storage/storage.cc", line 97
compreface-core | CUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected
compreface-core |
compreface-core | During handling of the above exception, another exception occurred:
compreface-core |
compreface-core | Traceback (most recent call last):
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2447, in wsgi_app
compreface-core |     response = self.full_dispatch_request()
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1945, in full_dispatch_request
compreface-core |     self.try_trigger_before_first_request_functions()
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1993, in try_trigger_before_first_request_functions
compreface-core |     func()
compreface-core |   File "/app/ml/./src/_endpoints.py", line 52, in init_model
compreface-core |     detector(
compreface-core |   File "/app/ml/./src/services/facescan/plugins/mixins.py", line 46, in __call__
compreface-core |     faces = self._fetch_faces(img, det_prob_threshold)
compreface-core |   File "/app/ml/./src/services/facescan/plugins/mixins.py", line 53, in _fetch_faces
compreface-core |     boxes = self.find_faces(img, det_prob_threshold)
compreface-core |   File "/app/ml/./src/services/facescan/plugins/insightface/insightface.py", line 103, in find_faces
compreface-core |     model = self._detection_model
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/cached_property.py", line 36, in __get__
compreface-core |     value = obj.__dict__[self.func.__name__] = self.func(obj)
compreface-core |   File "/app/ml/./src/services/facescan/plugins/insightface/insightface.py", line 80, in _detection_model
compreface-core |     model.prepare(ctx_id=self._CTX_ID, nms=self._NMS)
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/insightface/app/face_analysis.py", line 32, in prepare
compreface-core |     self.det_model.prepare(ctx_id, nms)
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/insightface/model_zoo/face_detection.py", line 217, in prepare
compreface-core |     model.bind(data_shapes=[('data', data_shape)])
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/module.py", line 422, in bind
compreface-core |     self._exec_group = DataParallelExecutorGroup(self._symbol, self._context,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py", line 280, in __init__
compreface-core |     self.bind_exec(data_shapes, label_shapes, shared_group)
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py", line 383, in bind_exec
compreface-core |     self.execs.append(self._bind_ith_exec(i, data_shapes_i, label_shapes_i,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py", line 675, in _bind_ith_exec
compreface-core |     executor = self.symbol.simple_bind(ctx=context, grad_req=self.grad_req,
compreface-core |   File "/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py", line 1944, in simple_bind
compreface-core |     raise RuntimeError(error_msg)
compreface-core | RuntimeError: simple_bind error. Arguments:
compreface-core | data: (1, 3, 480, 640)
compreface-core | Traceback (most recent call last):
compreface-core |   File "/work/mxnet/src/storage/storage.cc", line 97
compreface-core | CUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected
compreface-core | {"severity": "WARNING", "message": "500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.", "request": {"method": "GET", "path": "/status", "filename": "", "api_key": "", "remote_addr": "172.18.0.4"}, "logger": "root", "module": "error_handling", "traceback": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py\", line 1903, in simple_bind\n check_call(_LIB.MXExecutorSimpleBindEx(self.handle,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/base.py\", line 246, in check_call\n raise get_last_ffi_error()\nmxnet.base.MXNetError: Traceback (most recent call last):\n File \"/work/mxnet/src/storage/storage.cc\", line 97\nCUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.8/dist-packages/flask/app.py\", line 2447, in wsgi_app\n response = self.full_dispatch_request()\n File \"/usr/local/lib/python3.8/dist-packages/flask/app.py\", line 1945, in full_dispatch_request\n self.try_trigger_before_first_request_functions()\n File \"/usr/local/lib/python3.8/dist-packages/flask/app.py\", line 1993, in try_trigger_before_first_request_functions\n func()\n File \"/app/ml/./src/_endpoints.py\", line 52, in init_model\n detector(\n File \"/app/ml/./src/services/facescan/plugins/mixins.py\", line 46, in __call__\n faces = self._fetch_faces(img, det_prob_threshold)\n File \"/app/ml/./src/services/facescan/plugins/mixins.py\", line 53, in _fetch_faces\n boxes = self.find_faces(img, det_prob_threshold)\n File \"/app/ml/./src/services/facescan/plugins/insightface/insightface.py\", line 103, in find_faces\n model = self._detection_model\n File \"/usr/local/lib/python3.8/dist-packages/cached_property.py\", line 36, in __get__\n value = obj.__dict__[self.func.__name__] = self.func(obj)\n File \"/app/ml/./src/services/facescan/plugins/insightface/insightface.py\", line 80, in _detection_model\n model.prepare(ctx_id=self._CTX_ID, nms=self._NMS)\n File \"/usr/local/lib/python3.8/dist-packages/insightface/app/face_analysis.py\", line 32, in prepare\n self.det_model.prepare(ctx_id, nms)\n File \"/usr/local/lib/python3.8/dist-packages/insightface/model_zoo/face_detection.py\", line 217, in prepare\n model.bind(data_shapes=[('data', data_shape)])\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/module.py\", line 422, in bind\n self._exec_group = DataParallelExecutorGroup(self._symbol, self._context,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py\", line 280, in __init__\n self.bind_exec(data_shapes, label_shapes, shared_group)\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py\", line 383, in bind_exec\n self.execs.append(self._bind_ith_exec(i, data_shapes_i, label_shapes_i,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/module/executor_group.py\", line 675, in _bind_ith_exec\n executor = self.symbol.simple_bind(ctx=context, grad_req=self.grad_req,\n File \"/usr/local/lib/python3.8/dist-packages/mxnet/symbol/symbol.py\", line 1944, in simple_bind\n raise RuntimeError(error_msg)\nRuntimeError: simple_bind error. Arguments:\ndata: (1, 3, 480, 640)\nTraceback (most recent call last):\n File \"/work/mxnet/src/storage/storage.cc\", line 97\nCUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: no CUDA-capable device is detected\n", "build_version": "dev"}
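The root error in that trace is "no CUDA-capable device is detected", so one thing that may be worth checking is whether the compreface-core container can still see a GPU at all. Below is a minimal sketch of such a check, not part of CompreFace. It assumes `nvidia-smi` is available inside the container; the helper names `parse_gpu_list` and `visible_gpus` are made up for illustration.

```python
import shutil
import subprocess
from typing import List


def parse_gpu_list(output: str) -> List[str]:
    """Keep only the lines of `nvidia-smi -L` output that describe a GPU."""
    return [
        line.strip()
        for line in output.splitlines()
        if line.strip().startswith("GPU ")
    ]


def visible_gpus() -> List[str]:
    """Return the GPUs this process can see, or [] if none can be queried."""
    # Assumption: nvidia-smi is on PATH inside the GPU image.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = visible_gpus()
    print("\n".join(gpus) if gpus else "no GPU visible to this container")
```

If this prints no GPU inside the container while the host still lists one, the problem is probably in the container's GPU access rather than in CompreFace itself.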

tyanai commented 11 months ago

The core-api log shows this exception:

com.exadel.frs.commonservice.sdk.faces.exception.FacesServiceException: Error during synchronization between servers: [500 INTERNAL SERVER ERROR] during [GET] to [http://compreface-core:3000/status] [FacesFeignClient#getStatus()]: [{"message":"500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."}
compreface-api | ]
compreface-api |     at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient.getStatus(FacesRestApiClient.java:101)
compreface-api |     at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient$$FastClassBySpringCGLIB$$517e8caf.invoke()
compreface-api |     at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
compreface-api |     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
compreface-api |     at org.springframework.cache.interceptor.CacheInterceptor.lambda$invoke$0(CacheInterceptor.java:54)
compreface-api |     at org.springframework.cache.interceptor.CacheAspectSupport.invokeOperation(CacheAspectSupport.java:366)
compreface-api |     at org.springframework.cache.interceptor.CacheAspectSupport.execute(CacheAspectSupport.java:421)
compreface-api |     at org.springframework.cache.interceptor.CacheAspectSupport.execute(CacheAspectSupport.java:345)
compreface-api |     at org.springframework.cache.interceptor.CacheInterceptor.invoke(CacheInterceptor.java:64)
compreface-api |     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
compreface-api |     at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:708)
compreface-api |     at com.exadel.frs.commonservice.sdk.faces.service.FacesRestApiClient$$EnhancerBySpringCGLIB$$cb805c15.getStatus()
compreface-api |     at com.exadel.frs.core.trainservice.controller.ConsistenceController.getCheckDemo(ConsistenceController.java:25)
compreface-api |     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
compreface-api |     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
compreface-api |     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
compreface-api |     at java.base/java.lang.reflect.Method.invoke(Unknown Source)
compreface-api |     at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
compreface-api |     at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150)
compreface-api |     at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:117)
compreface-api |     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
compreface-api |     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
compreface-api |     at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
compreface-api |     at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1067)
compreface-api |     at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:963)
compreface-api |     at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
compreface-api |     at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
compreface-api |     at javax.servlet.http.HttpServlet.service(HttpServlet.java:655)
compreface-api |     at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
compreface-api |     at javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at com.exadel.frs.core.trainservice.filter.SecurityValidationFilter.doFilter(SecurityValidationFilter.java:134)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:96)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
compreface-api |     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
compreface-api |     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
compreface-api |     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)
compreface-api |     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
compreface-api |     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
compreface-api |     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
compreface-api |     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
compreface-api |     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
compreface-api |     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:360)
compreface-api |     at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:399)
compreface-api |     at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
compreface-api |     at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:890)
compreface-api |     at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1743)
compreface-api |     at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
compreface-api |     at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
compreface-api |     at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
compreface-api |     at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
compreface-api |     at java.base/java.lang.Thread.run(Unknown Source)
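The exception shows compreface-api failing on a GET to http://compreface-core:3000/status, so the same request can be reproduced by hand from a container on the same Docker network. The following is a rough sketch using only the Python standard library. The URL comes from the exception above, while the function names and the returned labels are invented for illustration and are not part of CompreFace.

```python
import json
import urllib.error
import urllib.request

# URL taken from the exception; it only resolves inside the compose network.
STATUS_URL = "http://compreface-core:3000/status"


def classify(status: int, body: str) -> str:
    """Map an HTTP status and body to a coarse, illustrative health label."""
    if status == 200:
        return "ok"
    try:
        message = str(json.loads(body).get("message", ""))
    except (ValueError, AttributeError):
        # Body was not a JSON object; fall back to the raw text.
        message = body
    if status == 500:
        return "core-error: " + message[:80]
    return "unexpected-%d" % status


def probe(url: str = STATUS_URL, timeout: float = 5.0) -> str:
    """Request the status endpoint once and describe the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # Non-2xx responses arrive as HTTPError; the body is still readable.
        return classify(err.code, err.read().decode("utf-8", "replace"))
    except (urllib.error.URLError, OSError) as err:
        return "unreachable: %s" % err


if __name__ == "__main__":
    print(probe())
```

A "core-error" result here would match what compreface-api reports, which suggests the fault lies in compreface-core rather than in the API service.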