triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

Replace subprocess.Popen with subprocess.run #452

Open rlempka opened 4 months ago

rlempka commented 4 months ago

The use of Popen creates a non-blocking subprocess. When launching Docker containers in detached mode (-d flag), this causes the container to start and then immediately stop, because the launch script, which is the container's main process, exits as soon as it has spawned the server. In addition, no logs or errors are generated on exit.

Launching containers in detached mode is very common, and launch_triton_server.py may be specified as the entry point. Using run makes the script block until tritonserver exits, which follows Docker best practices by keeping the container's main process alive for as long as tritonserver is running (https://docs.docker.com/config/containers/multi-service_container/).
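
To illustrate the difference, here is a minimal sketch that does not use the actual code from launch_triton_server.py. The CHILD_CMD placeholder and both helper function names are hypothetical; a short-lived Python child stands in for the real server command. It shows that subprocess.Popen returns while the child is still running, whereas subprocess.run returns only after the child has exited.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for the real server command: a child that lives 0.5 s.
CHILD_CMD = [sys.executable, "-c", "import time; time.sleep(0.5)"]


def launch_non_blocking(cmd):
    """Spawn the child and return immediately, as subprocess.Popen does."""
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    return proc, time.monotonic() - start


def launch_blocking(cmd):
    """Spawn the child and wait for it to exit, as subprocess.run does."""
    start = time.monotonic()
    completed = subprocess.run(cmd, check=False)
    return completed, time.monotonic() - start


if __name__ == "__main__":
    proc, t_popen = launch_non_blocking(CHILD_CMD)
    # poll() returns None while the child is still running.
    print(f"Popen returned after {t_popen:.3f}s, child running: {proc.poll() is None}")
    proc.wait()  # reap the child so it does not linger

    completed, t_run = launch_blocking(CHILD_CMD)
    print(f"run returned after {t_run:.3f}s, exit code: {completed.returncode}")
```

If the launch script is the container's entry point, the first pattern lets the script reach its end while the server is still starting, so the container stops. The second pattern keeps the script alive until the server exits. Calling wait() on the Popen object has the same blocking effect and would be an alternative to switching to run.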