triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html

Dynamically Limit Endpoint Access #7183

Open · amoosebitmymom opened this issue 4 months ago

amoosebitmymom commented 4 months ago

Is your feature request related to a problem? Please describe. When creating new API keys or modifying the permissions of an existing one, I can't afford to roll out my deployment again. Currently there is no other choice, because the API keys are read and configured only once, at the server's entrypoint.
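For context, my understanding is that the existing endpoint-restriction feature is configured via command-line flags when the server starts; the flag syntax below follows the docs, but the key name, value, and endpoint list are illustrative:

```sh
# Restrictions are fixed at startup: changing the key/value pair or the
# protected endpoints means restarting the server, i.e. a new rollout.
tritonserver \
  --model-repository=/models \
  --http-restricted-api=model-repository,shared-memory:admin-key=admin-value \
  --grpc-restricted-protocol=model-repository,shared-memory:admin-key=admin-value
```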

Describe the solution you'd like Accepting a path to a file that contains a mapping between API key names, key values, and permissions: the names and values correspond to the restricted header key/value pairs, and the permissions are the endpoints being restricted (a sketch follows below). This file could be a mounted Kubernetes secret, and could therefore be hot-swapped. The HTTP and gRPC servers would have to re-read the file at request time in order to pick up changes made to it.
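For illustration, the mapping file might look something like this; the path, schema, and field names are hypothetical, and only the key/value/endpoint concepts come from the existing restriction feature:

```yaml
# /etc/triton/restricted-keys.yaml (hypothetical path and schema)
# Each entry maps a restricted header key/value pair to the endpoints
# it unlocks; the server would re-read this file to pick up changes.
restrictions:
  - name: admin-key          # restricted header key
    value: admin-value       # restricted header value
    endpoints:               # endpoints gated by this key
      - model-repository
      - shared-memory
  - name: metrics-key
    value: metrics-value
    endpoints:
      - statistics
```

Mounted as a Kubernetes secret, updating the secret would rewrite the file inside the pod, so no rollout would be needed.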

Describe alternatives you've considered Wrapping the Triton server with our own proxy that uses a mounted secret. However, this would require updating the proxy's endpoint list with every Triton upgrade, instead of having a seamless experience.

nnshah1 commented 4 months ago

Instead of wrapping the server, can you place a reverse proxy in front of your server instance?

The endpoint restrictions were not designed to replace a full-featured solution with key management (revocation, permission changes, etc.) in mind. Our initial inclination is not to add more features here, but to encourage the use of API gateways designed for that purpose.

Are there downsides to keeping such key management outside of the Triton server itself?
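As a concrete example of the gateway approach, a minimal nginx sketch might gate the model-repository HTTP API like this; the header name, value, upstream address, and restricted path are assumptions for illustration:

```nginx
# Minimal sketch, not a production config. The "admin-key" request
# header is exposed by nginx as $http_admin_key.
upstream triton {
    server triton:8000;   # Triton's HTTP port (assumed)
}

server {
    listen 80;

    # Gate the model-repository endpoints behind an admin header.
    location /v2/repository/ {
        if ($http_admin_key != "admin-value") {
            return 403;
        }
        proxy_pass http://triton;
    }

    # Everything else (inference, health, metadata) passes through.
    location / {
        proxy_pass http://triton;
    }
}
```

Swapping the mounted secret and reloading the proxy updates keys without touching Triton, though the location list still has to track Triton's endpoint set across upgrades, which is the maintenance burden raised below.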

amoosebitmymom commented 4 months ago

The only downside of keeping it outside of Triton is that every engineer has to track endpoint changes themselves instead of that being built into the server. So if an update introduces a new set of endpoints, it is up to every single maintainer to update their gateway after updating the server, instead of it already being taken care of as part of the update.