Can't access served models via grpc

🐛 Describe the bug

I have two models served on a remote server. Accessible via http request without an issue. However, when trying to connect via grpc, I get the error failed to connect to all address

Traceback (most recent call last):
  File "test_scripts/proto/client.py", line 104, in <module>
    infer(get_inference_stub(), args[1], args[2])
  File "test_scripts/proto/client.py", line 29, in infer
    response = stub.Predictions(
  File "/home/mohitm/venv/torch_venv/lib/python3.8/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/mohitm/venv/torch_venv/lib/python3.8/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1654087830.773937466","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3217,"referenced_errors":[{"created":"@1654087830.773936299","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":165,"grpc_status":14}]}"

I have generated the pb2 files using the following command:

python -m grpc_tools.protoc --proto_path=test_scripts/proto/ --python_out=test_scripts/proto --grpc_python_out=test_scripts/proto/ test_scripts/proto/inference.proto test_scripts/proto/management.proto

The inference.proto and management.proto were downloaded from https://github.com/pytorch/serve/tree/master/frontend/server/src/main/resources/proto

The following is my docker command:

sudo docker run --gpus all -p 8080:8080 -p 8081:8081 -p 8082:8082 -p 7070:7070 -p 7071:7071 \
--mount type=bind,source=/home/mohitm/repos/torch_serve/model_store/,target=/tmp/models/ \
--mount type=bind,source=/home/mohitm/repos/torch_serve/configs/config.properties,target=/tmp/models/config.properties \
--mount type=bind,source=/home/mohitm/repos/torch_serve/configs/model_2_config.json,target=/tmp/models/mode_2_config.json \
-t pytorch/torchserve:latest-gpu \
torchserve \
--model-store /tmp/models/ \
--models all \
--ts-config /tmp/models/config.properties > logs/test.log

Error logs

The following is my log file

WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2022-06-01T12:50:52,313 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2022-06-01T12:50:52,398 [INFO ] main org.pytorch.serve.ModelServer - 
Torchserve version: 0.5.3
TS Home: /home/venv/lib/python3.8/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 1
Number of CPUs: 8
Max heap size: 7912 M
Python executable: /home/venv/bin/python
Config file: /tmp/models/config.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8081
Metrics address: http://0.0.0.0:8082
Model Store: /tmp/models
Initial Models: all
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 655350000
Maximum Request Size: 655350000
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
Workflow Store: /tmp/models
Model config: N/A
2022-06-01T12:50:52,404 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2022-06-01T12:50:52,422 [DEBUG] main org.pytorch.serve.ModelServer - Loading models from model store: model_1.mar
2022-06-01T12:50:56,257 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model model_1
2022-06-01T12:50:56,258 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model model_1
2022-06-01T12:50:56,258 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model model_1 loaded.
2022-06-01T12:50:56,258 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: model_1, count: 1
2022-06-01T12:50:56,265 [DEBUG] main org.pytorch.serve.ModelServer - Loading models from model store: model_2.mar
2022-06-01T12:50:56,266 [DEBUG] W-9000-model_1_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9000]
2022-06-01T12:50:56,951 [INFO ] W-9000-model_1_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000
2022-06-01T12:50:56,952 [INFO ] W-9000-model_1_1.0-stdout MODEL_LOG - [PID]43
2022-06-01T12:50:56,953 [INFO ] W-9000-model_1_1.0-stdout MODEL_LOG - Torch worker started.
2022-06-01T12:50:56,953 [INFO ] W-9000-model_1_1.0-stdout MODEL_LOG - Python runtime: 3.8.0
2022-06-01T12:50:56,953 [DEBUG] W-9000-model_1_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-model_1_1.0 State change null -> WORKER_STARTED
2022-06-01T12:50:56,964 [INFO ] W-9000-model_1_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000
2022-06-01T12:50:57,009 [INFO ] W-9000-model_1_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
2022-06-01T12:50:57,012 [INFO ] W-9000-model_1_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req. to backend at: 1654087857012
2022-06-01T12:50:57,044 [INFO ] W-9000-model_1_1.0-stdout MODEL_LOG - model_name: model_1, batchSize: 1
2022-06-01T12:51:00,343 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model model_2
2022-06-01T12:51:00,343 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model model_2
2022-06-01T12:51:00,343 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model model_2 loaded.
2022-06-01T12:51:00,343 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: model_2, count: 1
2022-06-01T12:51:00,344 [DEBUG] W-9001-model_2_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9001]
2022-06-01T12:51:00,345 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2022-06-01T12:51:00,355 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080
2022-06-01T12:51:00,355 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2022-06-01T12:51:00,356 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://0.0.0.0:8081
2022-06-01T12:51:00,356 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2022-06-01T12:51:00,357 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://0.0.0.0:8082
Model server started.
2022-06-01T12:51:00,555 [WARN ] pool-3-thread-1 org.pytorch.serve.metrics.MetricCollector - worker pid is not available yet.
2022-06-01T12:51:00,959 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9001
2022-06-01T12:51:00,960 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - [PID]58
2022-06-01T12:51:00,960 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Torch worker started.
2022-06-01T12:51:00,960 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Python runtime: 3.8.0
2022-06-01T12:51:00,960 [DEBUG] W-9001-model_2_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-model_2_1.0 State change null -> WORKER_STARTED
2022-06-01T12:51:00,961 [INFO ] W-9001-model_2_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9001
2022-06-01T12:51:00,962 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9001.
2022-06-01T12:51:00,962 [INFO ] W-9001-model_2_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req. to backend at: 1654087860962
2022-06-01T12:51:00,982 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - model_name: model_2, batchSize: 1
2022-06-01T12:51:00,983 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Current dir - ['yolov3-spp.weights', 'model_2_util.py', 'model_2_handler.py', 'model_2_classes.py', 'yolov3-spp.cfg', 'MAR-INF', 'custom_handler.py', '__pycache__', 'model_2.py', 'model_2_config.json', 'model_2_bbox.py']
2022-06-01T12:51:00,990 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Importing directly from model_2
2022-06-01T12:51:00,996 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Successfully imported directly from model_2
2022-06-01T12:51:01,002 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - I am inside model_2
2022-06-01T12:51:01,002 [INFO ] W-9001-model_2_1.0-stdout MODEL_LOG - Parsing the model_2 file
2022-06-01T12:51:01,119 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,119 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:27.81881332397461|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,120 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:34.021541595458984|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,120 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:55.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,120 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:7.518697465087034|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,120 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1136|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,121 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:7|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,121 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:27366.5703125|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,121 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:3802.2109375|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,121 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:13.5|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,647 [INFO ] W-9000-model_1_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 4603
2022-06-01T12:51:01,648 [DEBUG] W-9000-model_1_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-model_1_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2022-06-01T12:51:01,648 [INFO ] W-9000-model_1_1.0 TS_METRICS - W-9000-model_1_1.0.ms:5385|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087861
2022-06-01T12:51:01,648 [INFO ] W-9000-model_1_1.0 TS_METRICS - WorkerThreadTime.ms:33|#Level:Host|#hostname:b778691b8fd1,timestamp:null
2022-06-01T12:51:04,774 [INFO ] W-9001-model_2_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 3792
2022-06-01T12:51:04,774 [DEBUG] W-9001-model_2_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-model_2_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2022-06-01T12:51:04,774 [INFO ] W-9001-model_2_1.0 TS_METRICS - W-9001-model_2_1.0.ms:4431|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087864
2022-06-01T12:51:04,775 [INFO ] W-9001-model_2_1.0 TS_METRICS - WorkerThreadTime.ms:21|#Level:Host|#hostname:b778691b8fd1,timestamp:null
2022-06-01T12:52:01,043 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,043 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:27.818756103515625|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,043 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:34.02159881591797|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,044 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:55.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,044 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:16.15593354954001|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,044 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:2441|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,044 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,044 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:25259.8125|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,045 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:5896.4765625|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:52:01,045 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:20.2|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087921
2022-06-01T12:53:01,040 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,040 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:27.81875228881836|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,040 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:34.021602630615234|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,040 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:55.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,041 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:16.15593354954001|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,041 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:2441|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,041 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,041 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:25258.07421875|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,041 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:5898.21484375|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:53:01,041 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:20.2|#Level:Host|#hostname:b778691b8fd1,timestamp:1654087981
2022-06-01T12:54:01,043 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,043 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:27.818740844726562|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,043 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:34.02161407470703|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,043 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:55.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,043 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:16.15593354954001|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,043 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:2441|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,044 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,044 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:25265.07421875|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,044 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:5891.21484375|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:54:01,044 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:20.2|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088041
2022-06-01T12:55:01,043 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,044 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:27.818737030029297|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,044 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:34.0216178894043|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,044 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:55.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,044 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:16.15593354954001|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,044 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:2441|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,044 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,044 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:25259.625|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,045 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:5896.6640625|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:01,045 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:20.2|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088101
2022-06-01T12:55:16,242 [INFO ] epollEventLoopGroup-3-1 ACCESS_LOG - /x.x.x:33130 "GET /models HTTP/1.1" 200 4
2022-06-01T12:55:16,243 [INFO ] epollEventLoopGroup-3-1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:b778691b8fd1,timestamp:null
2022-06-01T12:55:20,589 [INFO ] epollEventLoopGroup-3-2 ACCESS_LOG - /x.x.x.x:33132 "GET /models HTTP/1.1" 200 1
2022-06-01T12:55:20,589 [INFO ] epollEventLoopGroup-3-2 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:b778691b8fd1,timestamp:null
2022-06-01T12:55:48,224 [INFO ] epollEventLoopGroup-3-3 ACCESS_LOG - /x.x.x.x:33134 "GET /models HTTP/1.1" 200 1
2022-06-01T12:55:48,224 [INFO ] epollEventLoopGroup-3-3 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:b778691b8fd1,timestamp:null
2022-06-01T12:55:52,582 [INFO ] epollEventLoopGroup-3-4 ACCESS_LOG - /x.x.x.x:33136 "GET /models HTTP/1.1" 200 0
2022-06-01T12:55:52,582 [INFO ] epollEventLoopGroup-3-4 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:b778691b8fd1,timestamp:null
2022-06-01T12:56:01,045 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,045 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:27.81871795654297|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,045 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:34.021636962890625|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,045 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:55.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,045 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:16.15593354954001|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,046 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:2441|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,046 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,046 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:25184.0703125|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,046 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:5972.21875|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:01,046 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:20.4|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088161
2022-06-01T12:56:56,182 [INFO ] epollEventLoopGroup-3-5 ACCESS_LOG - /83.97.20.151:7574 "GET / HTTP/1.1" 405 1
2022-06-01T12:56:56,183 [INFO ] epollEventLoopGroup-3-5 TS_METRICS - Requests4XX.Count:1|#Level:Host|#hostname:b778691b8fd1,timestamp:null
2022-06-01T12:57:01,044 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,045 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:27.818710327148438|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,045 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:34.021644592285156|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,045 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:55.0|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,045 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:16.15593354954001|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,045 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:2441|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,045 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,045 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:25161.79296875|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,046 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:5994.49609375|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088221
2022-06-01T12:57:01,046 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:20.5|#Level:Host|#hostname:b778691b8fd1,timestamp:1654088221

Installation instructions

pip install torchserve torch-model-archiver

Model Packaing

model_1 packaging

torch-model-archiver --model-name model_1 --version 1.0 --serialized-file models/bj/model_1.pth --handler handlers/model_1_handler.py --export-path model_store --model-file model_classes/fp/model_1.py --force

model_2 packaging

torch-model-archiver --model-name model_2 --version 1.0 --serialized-file models/bj/model_2.weights --handler handlers/model_2_handler.py --export-path model_store --model-file model_classes/dn/model_2.py --force --extra-files configs/model_2_config.json,model_classes/dn/model_2_bbox.py,model_classes/dn/model_2_util.py,handlers/custom_handler.py,model_classes/dn/model_2_classes.py,models/bj/model_2.cfg

config.properties

max_request_size=655350000
max_response_size=655350000
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
grpc_inference_port=7070
grpc_management_port=7071
cors_allowed_origin=*
cors_allowed_methods=GET, POST, PUT, OPTIONS
cors_allowed_headers=X-Custom-Header

#install requirements for each model if specified
install_py_dep_per_model=true

Versions

TorchServe Version is 0.5.3

Repro instructions

Created the marfiles
Ran the above docker command
Generated the pb2 files from the said .proto files
copied the client file from https://github.com/pytorch/serve/blob/master/ts_scripts/torchserve_grpc_client.py
Ran the following command python test_scripts/proto/client.py infer model_1 kitten_small.jpg

Possible Solution

No response

pytorch / serve

Can't access served models via grpc #1662