triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Why aren't there generate and generate_stream APIs in the http client? #7602

Closed MasterYi1024 closed 1 month ago

MasterYi1024 commented 2 months ago

Is your feature request related to a problem? Please describe.

I'm wondering why there aren't generate and generate_stream APIs in the http client.
If I want to use the generate API from C++, should I add it to the http client class myself?

How can I chat with LLM using http client?

Is there a demo?

Describe the solution you'd like

I would like generate and generate_stream APIs available in C.

Describe alternatives you've considered

Any alternative will do. I just want to use Triton (self-built, with the Triton client communicating with tritonserver) to chat with an LLM.

Additional context

Nothing.

Thanks in advance:)

KrishnanPrash commented 1 month ago

Hello @MasterYi1024,

The /generate and /generate_stream endpoints were added to provide simple text-in/text-out payloads that can be sent with any HTTP client, without using tritonclient or dealing with input/output tensors directly. For that reason, I don't believe adding these endpoints to the httpclient class is currently scoped.
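Because /generate is just a plain HTTP endpoint, you can call it from C++ without tritonclient at all. Below is a minimal sketch using libcurl; the model name `my_llm` and the request fields (`text_input`, `max_tokens`) are only examples and depend on how your model is configured:

```cpp
// Minimal sketch: call Triton's /generate endpoint directly with libcurl.
// Assumptions: server on localhost:8000, a model named "my_llm" that accepts
// a "text_input" field and a "max_tokens" parameter (adjust for your model).
#include <curl/curl.h>
#include <iostream>
#include <string>

// Append the response body into a std::string.
static size_t WriteCallback(char* data, size_t size, size_t nmemb, void* userp) {
  static_cast<std::string*>(userp)->append(data, size * nmemb);
  return size * nmemb;
}

int main() {
  CURL* curl = curl_easy_init();
  if (!curl) return 1;

  // POST v2/models/<model_name>/generate with a JSON payload.
  const std::string url = "http://localhost:8000/v2/models/my_llm/generate";
  const std::string body =
      R"({"text_input": "What is the Triton Inference Server?", "max_tokens": 128})";

  std::string response;
  struct curl_slist* headers = nullptr;
  headers = curl_slist_append(headers, "Content-Type: application/json");

  curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
  curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
  curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

  CURLcode rc = curl_easy_perform(curl);
  if (rc == CURLE_OK) {
    std::cout << response << std::endl;  // JSON response containing "text_output"
  }

  curl_slist_free_all(headers);
  curl_easy_cleanup(curl);
  return rc == CURLE_OK ? 0 : 1;
}
```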

For now, I would recommend following this example for chatting with LLMs using the httpclient.
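As a rough sketch of that flow (not the linked example itself), the following uses the tritonclient C++ httpclient and builds the input/output tensors explicitly. The model name `my_llm` and the tensor names `text_input`/`text_output` are assumptions and must match the deployed model's config.pbtxt:

```cpp
// Sketch: chat with an LLM through the tritonclient C++ httpclient by
// constructing BYTES input/output tensors. Model and tensor names below
// are assumptions; use the names from your model's configuration.
#include <http_client.h>

#include <iostream>
#include <memory>
#include <string>
#include <vector>

namespace tc = triton::client;

int main() {
  std::unique_ptr<tc::InferenceServerHttpClient> client;
  tc::InferenceServerHttpClient::Create(&client, "localhost:8000");

  // One BYTES element holding the prompt.
  tc::InferInput* input;
  tc::InferInput::Create(&input, "text_input", {1}, "BYTES");
  std::shared_ptr<tc::InferInput> input_ptr(input);
  input_ptr->AppendFromString({"What is the Triton Inference Server?"});

  tc::InferRequestedOutput* output;
  tc::InferRequestedOutput::Create(&output, "text_output");
  std::shared_ptr<tc::InferRequestedOutput> output_ptr(output);

  // Run inference against the assumed model name.
  tc::InferOptions options("my_llm");
  tc::InferResult* result;
  client->Infer(&result, options, {input_ptr.get()}, {output_ptr.get()});

  // Decode the BYTES output tensor back into strings.
  std::vector<std::string> text_output;
  result->StringData("text_output", &text_output);
  for (const auto& t : text_output) {
    std::cout << t << std::endl;
  }

  delete result;
  return 0;
}
```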