triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Is it possible to make gRPC use a Unix socket instead of TCP in Triton Server? #4095

Open PauloFavero opened 2 years ago

PauloFavero commented 2 years ago

We have a streaming service that uses gRPC with Unix sockets. gRPC performs much better over Unix sockets than over a TCP port. As far as I can see, Triton server only lets you change the port for the gRPC connection.

Could an option to change the gRPC transport be added as a new feature in the future? Is there a way to do this with the current implementation?

Thanks

deadeyegoodwin commented 2 years ago

It should be possible with changes to the server grpc implementation, but we haven't looked at it in detail.
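
For context, gRPC's own target syntax accepts Unix domain socket addresses through the `unix:` scheme, both when a server binds and when a client connects. The sketch below is a minimal illustration of that address format only. The helper name `grpc_target` and the socket path `/tmp/triton.sock` are hypothetical and not part of Triton, and nothing here implies that Triton currently accepts such an address.

```python
def grpc_target(endpoint: str) -> str:
    """Return a gRPC target string for a TCP 'host:port' or a Unix socket path.

    Hypothetical helper for illustration; not part of Triton or grpcio.
    """
    # Already a gRPC Unix-socket target: pass it through unchanged.
    if endpoint.startswith(("unix:", "unix-abstract:")):
        return endpoint
    # Absolute filesystem path -> unix:///absolute/path
    if endpoint.startswith("/"):
        return "unix://" + endpoint
    # Explicit relative path -> unix:relative/path
    if endpoint.startswith("./"):
        return "unix:" + endpoint
    # Anything else is treated as an ordinary host:port TCP target.
    return endpoint


if __name__ == "__main__":
    print(grpc_target("localhost:8001"))     # TCP target, unchanged
    print(grpc_target("/tmp/triton.sock"))   # Unix socket target

    # With grpcio installed and some gRPC server listening on that socket
    # (an assumption; the path is made up), a client could connect with:
    #   import grpc
    #   channel = grpc.insecure_channel(grpc_target("/tmp/triton.sock"))
```

On the server side, grpcio's `server.add_insecure_port()` takes the same kind of address string, so a `unix:` target could in principle be passed wherever a `host:port` string is passed today.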

guoxiaojie-schinper commented 2 years ago

I have the same problem. I deploy tritonserver and my application on the same machine, and I want to switch the transport from TCP to a Unix socket to speed up inference, but there is no way to change this.
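
To get a rough feel for the transport difference on a given machine, one option is to time small request/response round trips over a Unix domain socket and over TCP loopback. The sketch below uses only Python's standard `socket` module with a plain echo server. It does not involve gRPC or Triton, the function names are illustrative, and the numbers it prints only hint at raw transport latency, not at end-to-end inference speed.

```python
import os
import socket
import tempfile
import threading
import time


def _echo_once(listener: socket.socket) -> None:
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def round_trip_seconds(family: int, address, n: int = 2000, size: int = 64) -> float:
    """Average time of `n` echo round trips of `size` bytes over `family`."""
    listener = socket.socket(family, socket.SOCK_STREAM)
    listener.bind(address)
    listener.listen(1)
    thread = threading.Thread(target=_echo_once, args=(listener,), daemon=True)
    thread.start()

    client = socket.socket(family, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    payload = b"x" * size

    start = time.perf_counter()
    for _ in range(n):
        client.sendall(payload)
        received = 0
        while received < size:  # a stream socket may return partial reads
            chunk = client.recv(size - received)
            if not chunk:
                raise ConnectionError("echo server closed the connection")
            received += len(chunk)
    elapsed = time.perf_counter() - start

    client.close()   # lets the echo thread see EOF and exit
    thread.join()
    listener.close()
    return elapsed / n


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        uds = round_trip_seconds(socket.AF_UNIX, os.path.join(tmp, "echo.sock"))
    tcp = round_trip_seconds(socket.AF_INET, ("127.0.0.1", 0))  # port 0: any free port
    print(f"unix socket:  {uds * 1e6:.1f} us per round trip")
    print(f"tcp loopback: {tcp * 1e6:.1f} us per round trip")
```

Results vary with the operating system, kernel settings, and payload size, so it is worth measuring with payloads close to the real request sizes.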

andremoeller commented 2 years ago

Adding a +1 to this enhancement request. @deadeyegoodwin -- wondering if your team has had a chance to scope this out?

sakoush commented 2 years ago

Yes, this would be a very useful feature.

kelevra1993 commented 9 months ago

Any updates on this? Or does anyone know how I could start implementing this? Our current application might benefit greatly.

guoquan commented 2 months ago

Would be useful for local deployment. Looking forward to it.

phred-unity commented 6 days ago

Chiming in here: it would be great if this were looked into!