facebookresearch / AnimatedDrawings

Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
MIT License

Keep getting curl: (56) Recv failure: Connection reset by peer when pinging #291

Closed. Tigran01 closed this issue 1 week ago.

Tigran01 commented 1 week ago

I keep getting curl: (56) Recv failure: Connection reset by peer.

I changed the memory/CPU allocation in Docker, but I still encounter it. Any idea what could be causing this?

[Screenshot attached: 2024-06-25 at 3:11 AM]
Tigran01 commented 1 week ago

When I added a memory limit to the command (explicitly setting memory for the container), it started working at first, but then the same error appeared again.

(animated_drawings) tigran@Tigrans-MacBook-Air torchserve % docker run -d --name docker_torchserve -p 8080:8080 -p 8081:8081 --memory=6g docker_torchserve
304d1332e8c21b23d47ed2942f6732ee2e1de7e09e0b725f0a4be81d6fd93b2f
(animated_drawings) tigran@Tigrans-MacBook-Air torchserve % curl http://localhost:8080/ping
{
  "status": "Healthy"
}
[the ping kept returning { "status": "Healthy" } for roughly twenty more requests]
(animated_drawings) tigran@Tigrans-MacBook-Air torchserve % curl http://localhost:8080/ping
curl: (52) Empty reply from server
(animated_drawings) tigran@Tigrans-MacBook-Air torchserve % curl http://localhost:8080/ping
curl: (56) Recv failure: Connection reset by peer
[every subsequent ping failed the same way: curl: (56) Recv failure: Connection reset by peer]
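(For anyone else hitting this: a small shell loop like the sketch below can keep pinging and capture the container's state the moment the first failure appears. It assumes the same container name and port mapping as the docker run command above; the one-second interval and the OOM check are only illustrative guesses at a memory-pressure cause.)

    # Ping TorchServe once per second; on the first failure, dump the container state and recent logs.
    while true; do
      if ! curl -sf http://localhost:8080/ping > /dev/null; then
        echo "ping failed at $(date)"
        # Is the container still running, and was it killed by the OOM killer?
        docker inspect --format 'status={{.State.Status}} oom_killed={{.State.OOMKilled}}' docker_torchserve
        docker logs --tail 50 docker_torchserve
        break
      fi
      sleep 1
    done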

Tigran01 commented 1 week ago

Also, I think it's worth mentioning that the issue started after I accidentally uninstalled Docker along with some other apps I intended to remove, then reinstalled it and set the whole thing up again. Before that it was working fine, even without putting a memory limit in the command.

hjessmith commented 1 week ago

Thanks for reporting this, @Tigran01. It's useful to know this started after you reinstalled. Can you check the torchserve logs inside the Docker container and see if there are any useful error messages there?
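(For reference, one way to pull those logs, assuming the container is still named docker_torchserve as in the commands above; the exact log file layout inside the image may differ:)

    # Console output that Docker captured (TorchServe frontend plus worker stderr)
    docker logs --tail 200 docker_torchserve

    # Or open a shell in the container and look around TorchServe's working directory
    docker exec -it docker_torchserve /bin/bash
    ls logs/    # e.g. ts_log.log and model_log.log, if the default log config is used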

Tigran01 commented 1 week ago

@hjessmith Weirdly enough, the curl: (56) Recv failure: Connection reset by peer error doesn't reproduce at the moment (I'll add a comment with logs if it does). However, when trying to get an animation, I encounter "Failed to get bounding box, please check if the 'docker_torchserve' is running and healthy <Response: [503]>". Worth noting: this happened while docker_torchserve was reported as running and healthy; I checked both before and after. Later on the status became unhealthy. Below I've attached the logs both from when the error occurred and from the unhealthy state.

ERROR "Failed to get bounding box, please check if the 'docker_torchserve' is running and healthy <Response: [503]>":

2024-06-25T17:56:37,180 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - A module that was compiled using NumPy 1.x cannot be run in
2024-06-25T17:56:37,182 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - NumPy 2.0.0 as it may crash. To support both 1.x and 2.x
2024-06-25T17:56:37,183 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - versions of NumPy, modules must be compiled with NumPy 2.0.
2024-06-25T17:56:37,183 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
2024-06-25T17:56:37,184 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - If you are a user of the module, the easiest solution will be to
2024-06-25T17:56:37,184 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - downgrade to 'numpy<2' or try to upgrade the affected module.
2024-06-25T17:56:37,184 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - We expect that some modules will need time to support NumPy 2.
2024-06-25T17:56:37,185 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - Traceback (most recent call last): File "/opt/conda/lib/python3.11/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-06-25T17:56:37,185 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - worker.run_server()
2024-06-25T17:56:37,196 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-06-25T17:56:37,196 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - self.handle_connection(cl_socket)
2024-06-25T17:56:37,214 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-06-25T17:56:37,214 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - service, result, code = self.load_model(msg)
2024-06-25T17:56:37,214 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-06-25T17:56:37,214 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - service = model_loader.load(
2024-06-25T17:56:37,217 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/ts/model_loader.py", line 108, in load
2024-06-25T17:56:37,217 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - module, function_name = self._load_handler_file(handler)
2024-06-25T17:56:37,226 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-06-25T17:56:37,227 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - module = importlib.import_module(module_name)
2024-06-25T17:56:37,228 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/importlib/__init__.py", line 126, in import_module
2024-06-25T17:56:37,228 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - return _bootstrap._gcd_import(name[level:], package, level)
2024-06-25T17:56:37,229 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/tmp/models/12f801395bdb4fc7b42bd335a4f0dbf8/mmdet_handler.py", line 9, in <module>
2024-06-25T17:56:37,229 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from mmdet.apis import inference_detector, init_detector
2024-06-25T17:56:37,239 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmdet/apis/__init__.py", line 2, in <module>
2024-06-25T17:56:37,239 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from .inference import (async_inference_detector, inference_detector,
2024-06-25T17:56:37,239 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmdet/apis/inference.py", line 8, in <module>
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from mmcv.ops import RoIPool
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmcv/ops/__init__.py", line 9, in <module>
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from .carafe import CARAFE, CARAFENaive, CARAFEPack, carafe, carafe_naive
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmcv/ops/carafe.py", line 11, in <module>
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from ..cnn import UPSAMPLE_LAYERS, normal_init, xavier_init
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmcv/cnn/__init__.py", line 14, in <module>
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from .builder import MODELS, build_model_from_cfg
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmcv/cnn/builder.py", line 2, in <module>
2024-06-25T17:56:37,240 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from ..runner import Sequential
2024-06-25T17:56:37,241 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmcv/runner/__init__.py", line 3, in <module>
2024-06-25T17:56:37,242 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from .base_runner import BaseRunner
2024-06-25T17:56:37,242 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmcv/runner/base_runner.py", line 17, in <module>
2024-06-25T17:56:37,242 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from .checkpoint import load_checkpoint
2024-06-25T17:56:37,242 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/mmcv/runner/checkpoint.py", line 17, in <module>
2024-06-25T17:56:37,242 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - import torchvision
2024-06-25T17:56:37,293 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/torchvision/__init__.py", line 6, in <module>
2024-06-25T17:56:37,315 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from torchvision import datasets, io, models, ops, transforms, utils
2024-06-25T17:56:37,315 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/torchvision/models/__init__.py", line 17, in <module>
2024-06-25T17:56:37,318 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from . import detection, optical_flow, quantization, segmentation, video
2024-06-25T17:56:37,319 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/torchvision/models/detection/__init__.py", line 1, in <module>
2024-06-25T17:56:37,336 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from .faster_rcnn import *
2024-06-25T17:56:37,336 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/torchvision/models/detection/faster_rcnn.py", line 16, in <module>
2024-06-25T17:56:37,339 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - from .anchor_utils import AnchorGenerator
2024-06-25T17:56:37,344 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/torchvision/models/detection/anchor_utils.py", line 10, in <module>
2024-06-25T17:56:37,345 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - class AnchorGenerator(nn.Module):
2024-06-25T17:56:37,345 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - File "/opt/conda/lib/python3.11/site-packages/torchvision/models/detection/anchor_utils.py", line 63, in AnchorGenerator
2024-06-25T17:56:37,346 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - device: torch.device = torch.device("cpu"),
2024-06-25T17:56:37,346 [WARN ] W-9012-drawn_humanoid_detector_1.0-stderr MODEL_LOG - /opt/conda/lib/python3.11/site-packages/torchvision/models/detection/anchor_utils.py:63: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /root/pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)

[The same "compiled using NumPy 1.x cannot be run in NumPy 2.0.0" warning and import traceback is logged for the other drawn_humanoid_detector workers (W-9013, W-9014, ...) and, via /tmp/models/d883553efc574eb4bfb4820111206f78/mmpose_handler.py line 9 ("from mmpose.apis import (inference_bottom_up_pose_model, ...") and mmcv, for the drawn_humanoid_pose_estimator workers (W-9007, ...), each ending in the same UserWarning from anchor_utils.py.]
2024-06-25T17:57:13,501 [ERROR] W-9015-drawn_humanoid_detector_1.0 org.pytorch.serve.wlm.WorkerThread - Number or consecutive unsuccessful inference 18
2024-06-25T17:57:13,762 [ERROR] W-9015-drawn_humanoid_detector_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Backend worker did not respond in given time
        at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:242) [model-server.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]
2024-06-25T17:57:14,827 [WARN ] W-9015-drawn_humanoid_detector_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: drawn_humanoid_detector, error: Worker died.
2024-06-25T17:57:14,872 [DEBUG] W-9015-drawn_humanoid_detector_1.0 org.pytorch.serve.wlm.WorkerThread - W-9015-drawn_humanoid_detector_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-06-25T17:57:14,872 [WARN ] W-9015-drawn_humanoid_detector_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery failed again
2024-06-25T17:57:15,017 [INFO ] epollEventLoopGroup-5-5 org.pytorch.serve.wlm.WorkerThread - 9015 Worker disconnected. WORKER_STOPPED
2024-06-25T17:57:17,615 [INFO ] W-9015-drawn_humanoid_detector_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9015-drawn_humanoid_detector_1.0-stderr

[The same sequence (consecutive-unsuccessful-inference count, WorkerInitializationException "Backend worker did not respond in given time", "Load model failed ... error: Worker died.", state change WORKER_STARTED -> WORKER_STOPPED, "Auto recovery failed again", worker disconnected, stopped scanner) repeats for the other drawn_humanoid_detector workers (W-9008, W-9009, W-9010, W-9012, W-9014) and for the drawn_humanoid_pose_estimator workers (W-9000, W-9001, W-9003, W-9004, W-9006).]

UNHEALTHY:
2024-06-25T18:00:08,991 [INFO ] pool-2-thread-20 ACCESS_LOG - /192.168.65.1:46332 "GET /ping HTTP/1.1" 500 30
2024-06-25T18:00:08,996 [INFO ] pool-2-thread-20 TS_METRICS - Requests5XX.Count:1.0|#Level:Host|#hostname:84dd413809ef,timestamp:1719338408
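(When /ping starts returning 500 like this, TorchServe's management API on the other mapped port can show whether the model workers are still alive; a sketch, assuming the 8081 mapping from the docker run command above and the model names that appear in the logs:)

    # List the registered models
    curl http://localhost:8081/models
    # Show worker status for the two models named in the logs
    curl http://localhost:8081/models/drawn_humanoid_detector
    curl http://localhost:8081/models/drawn_humanoid_pose_estimator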

Tigran01 commented 1 week ago

@hjessmith Not sure if this will help (I'm not very experienced with how this works), but my previous setup of AnimatedDrawings (before I uninstalled Docker) predated commit 461fe94825d189aca98f34f8085f3c724cf7be2f. I had run into that issue, discovered it had since been fixed, and pulled again.

The other difference with this run was the extreme CPU usage: my computer started to slow down and heat up (I used the same computer previously), and the memory also filled up much more quickly this time. Hopefully this helps. Please let me know if there's anything I can do that would be more useful.

hjessmith commented 1 week ago

What platform are you trying to run Animated Drawings on? A local machine or something in the cloud? One option is to avoid Docker entirely; there are instructions for doing this on macOS in the main README. Would something like that work for you?

Tigran01 commented 1 week ago

I am running Docker on my local macOS machine, but running the actual scripts in a local Linux VM over a bridged network. I'm using Linux because I need headless rendering. Unfortunately, I can't run Docker inside the Linux VM due to the architectural limitations of the VM. I'll take a look at the macOS instructions now, but the thing is I specifically need headless rendering.

Tigran01 commented 1 week ago

@hjessmith Happy to say that I was finally able to fix the issue. The high CPU usage decreased significantly when I downgraded Docker (that's probably specific to my computer), but even then the "Failed to get bounding box, please check if the 'docker_torchserve' is running and healthy <Response: [503]>" error kept appearing because of the NumPy version. From a glimpse at the logs, it seemed that torchvision==0.15.1 was installing the latest NumPy as a dependency, so the version pinned in the setup file was getting overwritten.
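(A quick way to confirm that kind of version clash inside the running container, assuming python and pip are on the image's PATH, is to check what actually got installed:)

    # Print the NumPy version the workers see; anything >= 2.0 matches the
    # "compiled using NumPy 1.x cannot be run in NumPy 2.0.0" warnings above
    docker exec docker_torchserve python -c "import numpy; print(numpy.__version__)"
    docker exec docker_torchserve pip show numpy torchvision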

I added RUN pip install numpy==1.23.3 to the Dockerfile right after the RUN pip install torchvision==0.15.1 # solve torch version problem line (I guess it could also be 1.24.4, as in the setup file, but I previously had this version so I just stuck with it), and it worked just fine.
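(So the relevant part of the torchserve Dockerfile ends up looking roughly like this; only the numpy line is the addition, and the surrounding line is the one quoted above rather than copied verbatim from the repo:)

    RUN pip install torchvision==0.15.1   # solve torch version problem
    RUN pip install numpy==1.23.3         # or 1.24.4 as in the setup file; pins NumPy below 2.0 so the mmdet/mmpose workers can start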

Tigran01 commented 1 week ago

@hjessmith Confirmed that it works with RUN pip install numpy==1.24.4 and doesn't work without it. I can make a pull request if you'd like.

hjessmith commented 1 week ago

That would be great. I'll happily merge that PR.

Tigran01 commented 1 week ago

@hjessmith Great! Created pull request here.