triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

How to enable nsys when starting a Triton server using Python API #7209

Open jerry605 opened 6 months ago

jerry605 commented 6 months ago

Is your feature request related to a problem? Please describe.

Hi team,

We used to start the Triton server from the command line, where it was easy to enable nsys by running a command like the one below:

nsys profile tritonserver ...

We recently switched to the Python server API, since it seems to be the newer practice and the recommended approach going forward. We start the server like this:

self._triton_server = tritonserver.Server(model_repository=['/mount/data/models'], model_control_mode=tritonserver.ModelControlMode.EXPLICIT, log_info=True, log_warn=True, log_error=True)
self._triton_server.start(wait_until_ready=True)

We are wondering how we can enable nsys when using the Python server API. Thanks a lot.


nnshah1 commented 6 months ago

@jerry605 the in-process Python API is a way to create your own server application using the Triton core and backends, so it is not a direct replacement for the tritonserver REST / gRPC server application. There is no specific recommendation on which is better suited to your use case.

What happens when you use nsys to profile your resulting python module / application?
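For illustration, here is a minimal sketch of what that could look like, assuming the start-up code above lives in a script named serve.py (the script name, the nsys output name, and the explicit stop() call are illustrative assumptions, not something confirmed in this thread). Because the Python API runs the Triton core inside your own process, profiling that process with nsys should capture the server's activity:

# serve.py -- start an in-process Triton server via the Python API
import tritonserver

server = tritonserver.Server(
    model_repository=['/mount/data/models'],
    model_control_mode=tritonserver.ModelControlMode.EXPLICIT,
    log_info=True, log_warn=True, log_error=True)
server.start(wait_until_ready=True)

# ... issue inference requests or keep the process alive here ...

# shut the server down when done (assumed here for a clean exit)
server.stop()

Then wrap the Python interpreter with nsys, just as the tritonserver binary was wrapped before:

nsys profile -o triton_python_api python serve.py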