pytorch / serve

Serve, optimize and scale PyTorch models in production
https://pytorch.org/serve/
Apache License 2.0
4.14k stars 835 forks source link

When batch processing, the delay time is too long. #2005

Open Git-TengSun opened 1 year ago

Git-TengSun commented 1 year ago

question

I deployed a model using torchserve, batchsize=8, workers=1. When eight small images(wxh:721x222) are tested at the same time, torchserve can respond quickly and process them in the same batch. When eight large images(wxh:1369x1824) are tested, torchserve will delay responding for a long time(more than 100 seconds) , although it will be processed in the same batch eventually. I think it's abnormal. I don't know what caused it.

config.properties

enable_envvars_config=true inference_address=http://127.0.0.1:8080 management_address=http://127.0.0.1:8081 metrics_address=http://127.0.0.1:8082 grpc_inference_port=7070 grpc_management_port=7071 number_of_netty_threads=32

netty_client_threads=16

job_queue_size=1000 model_store=./model-store max_response_size=2047093252 max_request_size=2047093252

number_of_gpu=4

enable_metrics_api=false

prefer_direct_buffer=true

async_logging=true

decode_input_request=false

models={ "psenet": {\ "1.0": {\ "defaultVersion": true,\ "marName": "psenet.mar",\ "minWorkers": 1,\ "maxWorkers": 1,\ "batchSize": 8,\ "maxBatchDelay": 100,\ "responseTimeout": 1200\ }\ } }

Versions

TorchServe Version is 0.6.0,installed via conda. Environment headers Torchserve branch:

torchserve==0.6.0 torch-model-archiver==0.6.0

Python version: 3.8 (64-bit runtime) Python executable: /zhangfei401/346/envs/torchserve/bin/python

Versions of relevant python libraries: captum==0.5.0 future==0.18.2 numpy==1.20.3 nvgpu==0.9.0 psutil==5.9.2 pycocotools-fix-numpy==1.0.0 requests==2.28.1 torch==1.8.1+cu101 torch-model-archiver==0.6.0 torch-workflow-archiver==0.2.4 torchaudio==0.8.1 torchserve==0.6.0 torchvision==0.9.1+cu101 wheel==0.37.1 torch==1.8.1+cu101 **Warning: torchtext not present .. torchvision==0.9.1+cu101 torchaudio==0.8.1

Java Version: openjdk version "11.0.17" 2022-10-18 OpenJDK Runtime Environment (build 11.0.17+8-post-Ubuntu-1ubuntu218.04) OpenJDK 64-Bit Server VM (build 11.0.17+8-post-Ubuntu-1ubuntu218.04, mixed mode, sharing)

OS: Ubuntu 18.04.5 LTS GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Clang version: N/A CMake version: version 3.10.2

Is CUDA available: Yes CUDA runtime version: 11.0.221 GPU models and configuration: GPU 0: Tesla V100-SXM2-32GB GPU 1: Tesla V100-SXM2-32GB GPU 2: Tesla V100-SXM2-32GB GPU 3: Tesla V100-SXM2-32GB Nvidia driver version: 450.102.04 cuDNN version: Probably one of the following: /usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.4 /usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.4 /usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.4 /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.4 /usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.4 /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.4 /usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.4

msaroufim commented 1 year ago

Two important parameters to tune

  1. "responseTimeout": 1200 this will be a 1,200s timeout - would suggest drastically reducing this and see if anything changes
  2. "maxBatchDelay": 100, this will wait up to 100s for your batch to be filled out, it might be the case that sending the images over the wire is taking too long which is why you're consistently seeing results of 100s. Try reducing this or whatever elements are present for the batch to be filled out will be processed
Git-TengSun commented 1 year ago

@msaroufim Thank you for your answer. I tried both, nothing changed, only a timeout error was reported. If I only deal with a large image, it can respond quickly and return the result. I have now updated torchserve to the latest version. Large pictures can be uploaded quickly and model predictions can be made, but it takes a long time to return the predicted results. The result of the prediction is two predicted segmentation maps. If instead of returning two segmentation maps, I return two integer numbers, the return is not delayed. I think the size of the returned content affects the speed of this response, and I don't know how to solve this problem.

Here is my code

image

image

lxning commented 1 year ago

@Git-TengSun could you attach ts_metrics.log and model_metrics.log for the large images test case?

Git-TengSun commented 1 year ago

@lxning A batch of 8 large images was measured, and the result was returned after 215 seconds. here is the log

model_metrics.log

2022-12-02T07:18:39,959 - HandlerTime.Milliseconds:8470.9|#ModelName:psenet,Level:Model|#hostname:7b6ueu51kb46j-0,requestID:ec60760d-98db-47cf-9331-48a207d37017,9f381b65-c854-4c4c-a018-067931df6abf,074fd30f-4b26-43e3-aa82-f400a687456e,0191707f-2b47-43b8-86d6-3d6d0c36dcc9,a706e500-ad20-4808-b027-947ea75b26ca,e90cd441-bea1-40fb-b18b-3e3c440d2e68,15643703-d57c-4033-a49d-bd5b82489ef9,edb20bd7-e121-41eb-a749-fd3c62462c16,timestamp:1669965519 2022-12-02T07:18:39,960 - PredictionTime.Milliseconds:8477.81|#ModelName:psenet,Level:Model|#hostname:7b6ueu51kb46j-0,requestID:ec60760d-98db-47cf-9331-48a207d37017,9f381b65-c854-4c4c-a018-067931df6abf,074fd30f-4b26-43e3-aa82-f400a687456e,0191707f-2b47-43b8-86d6-3d6d0c36dcc9,a706e500-ad20-4808-b027-947ea75b26ca,e90cd441-bea1-40fb-b18b-3e3c440d2e68,15643703-d57c-4033-a49d-bd5b82489ef9,edb20bd7-e121-41eb-a749-fd3c62462c16,timestamp:1669965519

ts_metrics.log

2022-12-02T07:18:17,283 - CPUUtilization.Percent:6.2|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,284 - DiskAvailable.Gigabytes:772.9885711669922|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,285 - DiskUsage.Gigabytes:103.60663223266602|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,285 - DiskUtilization.Percent:11.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,286 - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,286 - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,287 - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,287 - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,288 - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,288 - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,288 - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,289 - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,289 - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,290 - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,290 - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,290 - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,291 - MemoryAvailable.Megabytes:345489.984375|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,291 - MemoryUsed.Megabytes:34736.36328125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:17,292 - MemoryUtilization.Percent:10.4|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965497 2022-12-02T07:18:22,014 - W-9000-psenet_1.0.ms:6539|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965502 2022-12-02T07:18:22,014 - WorkerThreadTime.ms:32|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965502 2022-12-02T07:19:11,125 - CPUUtilization.Percent:11.1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,126 - DiskAvailable.Gigabytes:772.9891738891602|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,126 - DiskUsage.Gigabytes:103.60602951049805|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,127 - DiskUtilization.Percent:11.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,127 - GPUMemoryUtilization.Percent:4.1156567210089205|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,128 - GPUMemoryUsed.Megabytes:1338|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,128 - GPUMemoryUtilization.Percent:38.9664718548139|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,129 - GPUMemoryUsed.Megabytes:12668|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,129 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,130 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,130 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,131 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,131 - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,131 - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,132 - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,132 - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,133 - MemoryAvailable.Megabytes:339464.21875|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,133 - MemoryUsed.Megabytes:40739.76953125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:19:11,134 - MemoryUtilization.Percent:11.9|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965551 2022-12-02T07:20:05,059 - CPUUtilization.Percent:6.2|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,060 - DiskAvailable.Gigabytes:772.9886322021484|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,060 - DiskUsage.Gigabytes:103.60657119750977|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,061 - DiskUtilization.Percent:11.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,061 - GPUMemoryUtilization.Percent:4.1156567210089205|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,062 - GPUMemoryUsed.Megabytes:1338|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,062 - GPUMemoryUtilization.Percent:38.9664718548139|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,063 - GPUMemoryUsed.Megabytes:12668|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,063 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,063 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,064 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,064 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,065 - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,065 - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,065 - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,066 - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,066 - MemoryAvailable.Megabytes:339424.01953125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,067 - MemoryUsed.Megabytes:40779.95703125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:20:05,067 - MemoryUtilization.Percent:11.9|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965605 2022-12-02T07:21:05,339 - CPUUtilization.Percent:5.3|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,340 - DiskAvailable.Gigabytes:772.9885902404785|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,587 - DiskUsage.Gigabytes:103.60661315917969|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,587 - DiskUtilization.Percent:11.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,588 - GPUMemoryUtilization.Percent:4.1156567210089205|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,588 - GPUMemoryUsed.Megabytes:1338|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,589 - GPUMemoryUtilization.Percent:38.9664718548139|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,589 - GPUMemoryUsed.Megabytes:12668|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,590 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,590 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,591 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,591 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,592 - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,592 - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,593 - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,593 - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,594 - MemoryAvailable.Megabytes:339397.859375|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,594 - MemoryUsed.Megabytes:40805.83203125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:05,595 - MemoryUtilization.Percent:11.9|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965665 2022-12-02T07:21:59,135 - CPUUtilization.Percent:5.6|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,136 - DiskAvailable.Gigabytes:772.9901466369629|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,137 - DiskUsage.Gigabytes:103.60505676269531|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,137 - DiskUtilization.Percent:11.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,138 - GPUMemoryUtilization.Percent:4.1156567210089205|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,138 - GPUMemoryUsed.Megabytes:1338|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,139 - GPUMemoryUtilization.Percent:38.9664718548139|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,139 - GPUMemoryUsed.Megabytes:12668|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,140 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,140 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,141 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,141 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,142 - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,142 - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,143 - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,143 - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,144 - MemoryAvailable.Megabytes:339131.5078125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,144 - MemoryUsed.Megabytes:41072.59375|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:21:59,145 - MemoryUtilization.Percent:12.0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965719 2022-12-02T07:22:06,342 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,345 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,364 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,366 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,384 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,385 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,404 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,405 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,423 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,424 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,443 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,444 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,462 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,463 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,500 - Requests2XX.Count:1|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965511 2022-12-02T07:22:06,501 - QueueTime.ms:0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:06,502 - WorkerThreadTime.ms:206|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965726 2022-12-02T07:22:52,934 - CPUUtilization.Percent:4.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,935 - DiskAvailable.Gigabytes:772.988883972168|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,936 - DiskUsage.Gigabytes:103.60631942749023|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,936 - DiskUtilization.Percent:11.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,937 - GPUMemoryUtilization.Percent:4.1156567210089205|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,937 - GPUMemoryUsed.Megabytes:1338|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,938 - GPUMemoryUtilization.Percent:38.9664718548139|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,938 - GPUMemoryUsed.Megabytes:12668|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,938 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,939 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,939 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,940 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,940 - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,941 - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,941 - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,942 - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,942 - MemoryAvailable.Megabytes:339126.53125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,942 - MemoryUsed.Megabytes:41076.89453125|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:22:52,943 - MemoryUtilization.Percent:12.0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965772 2022-12-02T07:23:47,014 - CPUUtilization.Percent:0.0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,014 - DiskAvailable.Gigabytes:772.8639221191406|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,015 - DiskUsage.Gigabytes:103.73128128051758|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,015 - DiskUtilization.Percent:11.8|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,016 - GPUMemoryUtilization.Percent:4.1156567210089205|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,016 - GPUMemoryUsed.Megabytes:1338|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,016 - GPUMemoryUtilization.Percent:38.9664718548139|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,017 - GPUMemoryUsed.Megabytes:12668|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,017 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,018 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,018 - GPUMemoryUtilization.Percent:0.009227929867733005|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,018 - GPUMemoryUsed.Megabytes:3|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,019 - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,019 - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,019 - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,020 - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,020 - MemoryAvailable.Megabytes:339135.81640625|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,020 - MemoryUsed.Megabytes:41068.29296875|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965827 2022-12-02T07:23:47,021 - MemoryUtilization.Percent:12.0|#Level:Host|#hostname:7b6ueu51kb46j-0,timestamp:1669965827

lxning commented 1 year ago

@Git-TengSun Thank you for the metrics log.

  1. According to log, we can see that it took about 8470.9 + 215 + 206 = 8891.9ms to process this batch and send response back.
  1. You can calculate the 8 requests network transferring time by the steps:

    • record your client send requests the timestamp X
    • find the timestamp Y of the 8 requests received by TorchServe in ts_log.log "Flushing req. to backend at".
    • network transferring time Z = Y - X.
  2. Check 8891.9ms + Z vs 215 seconds.