triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

reuse-http-port flag is integer and not boolean #6481

Closed: VictorM-PS closed this issue 11 months ago

VictorM-PS commented 11 months ago

Description The --reuse-http-port flag, which lets several Triton servers share a single port, is documented as a boolean in the help output, but it is actually parsed as an integer.

Triton Information What version of Triton are you using? 23.04

Are you using the Triton container or did you build it yourself? Triton container with fastertransformer_backend.

To Reproduce

root@ddf7471aa162:/workspace# /opt/tritonserver/bin/tritonserver --allow-http true --model-repository=./models/ner-model/  --load-model=*  --allow-http true --http-port 5023 --reuse-http-port true
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi
Aborted (core dumped)
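
The abort is consistent with the flag value being handed directly to std::stoi, which throws std::invalid_argument for non-numeric input. A minimal standalone sketch (not Triton's actual option-parsing code) reproducing the same what(): stoi failure:

#include <iostream>
#include <stdexcept>
#include <string>

int main() {
  const std::string flag_value = "true";  // what --reuse-http-port true passes in
  try {
    // std::stoi finds no leading digits in "true", so it throws
    // std::invalid_argument, whose what() is the string "stoi".
    int port_reuse = std::stoi(flag_value);
    std::cout << "parsed: " << port_reuse << '\n';
  } catch (const std::invalid_argument& e) {
    std::cerr << "what(): " << e.what() << '\n';
  }
  return 0;
}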

When running it with --reuse-http-port=1 instead, the server starts:

root@ddf7471aa162:/workspace# /opt/tritonserver/bin/tritonserver --allow-http true --model-repository=./models/ner-model/  --load-model=*  --allow-http true --http-port 5023 --reuse-http-port=1
I1025 14:02:19.135604 307 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fa262000000' with size 268435456
I1025 14:02:19.139103 307 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
...

Your test files cover this flag with the value 1, but the -h documentation is wrong/unclear, since it indicates a boolean true/false.

Expected behavior The command line should accept --reuse-http-port true, as documented in the help output.
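
For reference, one way the flag could accept the documented true/false spelling is to handle boolean-looking strings before falling back to integer parsing. This is only an illustrative sketch with made-up names, not the change made in #6511:

#include <stdexcept>
#include <string>

// Hypothetical helper: accepts the documented true/false spellings as well
// as the 0/1 values the current integer parsing expects.
bool ParseBoolOption(const std::string& arg) {
  if (arg == "true" || arg == "1") {
    return true;
  }
  if (arg == "false" || arg == "0") {
    return false;
  }
  throw std::invalid_argument("expected true/false or 1/0, got: " + arg);
}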

dyastremsky commented 11 months ago

Thank you for catching and reporting this bug, as well as detailing how our test files cover it. We have filed a ticket (DLIS-5697) and will fix this.

dyastremsky commented 11 months ago

Thanks for letting us know about this! Closed with #6511.