DJL Spring Boot Starter performance

In the Docker container virtualized on the company's Linux server djl-bench -e PyTorch -u https://alpha-djl-demos.s3.amazonaws.com/model/djl-blockrunner/pytorch_resnet18.zip -n traced_resnet18 -c 1000 -s 1,3,224,224 result: [INFO ] - Number of inter-op threads is 4 [INFO ] - Number of intra-op threads is 8 [INFO ] - Load PyTorch (2.1.1) in 0.086 ms. [INFO ] - Running Benchmark on: cpu(). Loading: 100% |████████████████████████████████████████| [INFO ] - Model traced_resnet18 loaded in: 1605.037 ms. [INFO ] - Warmup with 2 iteration ... [INFO ] - Warmup latency, min: 115.385 ms, max: 265.870 ms Iteration: 100% |████████████████████████████████████████| [INFO ] - Inference result: [-0.06938132, 0.6169942, -1.9312556 ...] [INFO ] - Throughput: 12.79, completed 1000 iteration in 78192 ms. [INFO ] - Model loading time: 1605.037 ms. [INFO ] - total P50: 39.692 ms, P90: 165.100 ms, P99: 573.895 ms [INFO ] - inference P50: 39.146 ms, P90: 161.749 ms, P99: 572.999 ms [INFO ] - preprocess P50: 0.179 ms, P90: 2.093 ms, P99: 10.505 ms [INFO ] - postprocess P50: 0.107 ms, P90: 0.152 ms, P99: 0.446 ms

However, on another Windows machine, it was run directly. [INFO ] - Number of inter-op threads is 1 [INFO ] - Number of intra-op threads is 1 [INFO ] - Load PyTorch (2.1.1) in 0.019 ms. [INFO ] - Running MultithreadedBenchmark on: [cpu()]. [INFO ] - Multithreading inference with 2 threads. Loading: 100% |========================================| [INFO ] - Model traced_resnet18 loaded in: 1583.521 ms. [INFO ] - Warmup with 2 iteration ... [INFO ] - Warmup latency, min: 37.997 ms, max: 119.212 ms [INFO ] - Completed 100 requests [INFO ] - Inference result: [-0.06938224, 0.616994, -1.9312545 ...] [INFO ] - Throughput: 46.66, completed 100 iteration in 2143 ms. [INFO ] - Model loading time: 1583.521 ms. [INFO ] - total P50: 42.663 ms, P90: 44.797 ms, P99: 55.650 ms [INFO ] - inference P50: 42.545 ms, P90: 44.657 ms, P99: 55.518 ms [INFO ] - preprocess P50: 0.074 ms, P90: 0.092 ms, P99: 0.259 ms [INFO ] - postprocess P50: 0.041 ms, P90: 0.054 ms, P99: 0.346 ms

why ?

deepjavalibrary / djl-spring-boot-starter

DJL Spring Boot Starter performance #39