triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Allow introspection and static analysis of `pb_utils` (Python backend) #5961

Open ClaytonJY opened 1 year ago

ClaytonJY commented 1 year ago

Is your feature request related to a problem? Please describe. When writing the model.py file for a Python backend model, it is very difficult to correctly use triton_python_backend_utils (aka pb_utils). I can't install the module locally, but even if I try to develop inside the container, the python module itself does not provide objects like InferenceRequest.

This means I can't do any kind of pre-run checks that my code works, such as with an IDE or a linter. I have no way to discover methods or properties of essential pb_utils objects; if it's not discussed in documentation, I can't know it exists.

Further, these same "missing" objects can't even be used as type hints, which is odd.

This is a really unfortunate developer experience, and leads to a lot of back and forth when developing: update model.py, restart server, make request.
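Until something official exists, one workaround for the IDE/linter problem above is a local stub module. This is only a sketch, not Triton tooling: the file name `pb_utils_stub.py` is made up, and the class and function names are copied from the python_backend docs, so they may lag behind the real module in any given release.

```python
# pb_utils_stub.py -- a hypothetical local stub, for linting and IDE
# completion only; NOT part of Triton. Names mirror the python_backend
# docs and may not match every Triton release.

class Tensor:
    """Stand-in for pb_utils.Tensor."""
    def __init__(self, name, data):
        self._name, self._data = name, data

    def name(self):
        return self._name

    def as_numpy(self):
        return self._data

class InferenceResponse:
    """Stand-in for pb_utils.InferenceResponse."""
    def __init__(self, output_tensors=None, error=None):
        self.output_tensors = output_tensors or []
        self.error = error

class TritonError(Exception):
    """Stand-in for pb_utils.TritonError."""

def get_input_tensor_by_name(request, name):
    """Stand-in for pb_utils.get_input_tensor_by_name."""
    return next((t for t in request.inputs() if t.name() == name), None)
```

In model.py the stub would be guarded so it never shadows the real module in-container, e.g. `try: import triton_python_backend_utils as pb_utils` / `except ImportError: import pb_utils_stub as pb_utils`.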

While I don't understand the details, I know this is related to the C code. That's somewhat understandable, but other FFI-heavy Python packages don't seem to have this issue, e.g. NumPy, PyTorch, etc.

Describe the solution you'd like I would like to be able to introspect `pb_utils` and statically analyze code that uses it: discover its classes, methods, and signatures from an IDE, a linter, or a REPL.

My preference would be for this to be possible outside of the provided containers (a package on PyPI?), but an in-container solution would still be a big improvement. If I could exec into a container, start a Python interpreter, run `import triton_python_backend_utils as pb_utils`, and have it expose all the functions and classes available at runtime, that'd be huge!
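The in-container discovery described above amounts to ordinary module introspection. A generic sketch, shown on the standard-library `json` module as a stand-in since `pb_utils` only resolves inside the backend:

```python
import inspect
import json  # stand-in module; in-container this would be
             # `import triton_python_backend_utils as pb_utils`

# Enumerate the public classes and functions the module exposes at runtime.
members = inspect.getmembers(
    json, lambda m: inspect.isclass(m) or inspect.isfunction(m)
)
public = sorted(name for name, _ in members if not name.startswith("_"))
print(public)
```

Today this style of probing only works against the injected module from inside a running Python backend, which is exactly the limitation this issue asks to lift.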

I suspect this involves some additional code to make explicit the C dependencies, but I'm not too sure what that looks like.

Describe alternatives you've considered Because this code is only available within the container, I'm not too sure what else I could do. If I'm missing something that would improve my developer experience here, I'm all ears!

P.S. I'm sorry if this isn't the right place for this; looks like Issues and Discussions have been disabled in the Python Backend repository

kthui commented 1 year ago

Thanks for the developer-experience suggestions; this is the right place for enhancement requests. I have filed a ticket for us to investigate further. DLIS-5046

arthur-st commented 1 year ago

I'd like to speak in support of this feature request as well. I recently evaluated Triton for an internal project, and the dynamic injection of triton_python_backend_utils was a big enough developer-experience flaw to put it off for now. I have nothing functional to add to the OP, which covers the requested functionality well. However, should you choose to distribute triton_python_backend_utils the standard Python way, I would suggest also shipping documentation for a reference hot-reload configuration.
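On the hot-reload point: the server already supports polling the model repository via its documented model-control flags, which shortens the edit loop even without a local package. A sketch (paths are illustrative):

```shell
# Re-scan the model repository every 5 seconds, so edits to model.py
# are picked up without restarting the container.
tritonserver \
  --model-repository=/models \
  --model-control-mode=poll \
  --repository-poll-secs=5
```

This addresses the restart half of the loop, though not the introspection half that this issue is about.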

arthur-st commented 1 year ago

Also, #5813 is functionally the same feature request, to have standard distribution channels available for Python.

shahin-nihahs commented 3 months ago

Any progress here? Not knowing what's inside an object is quite unintuitive.