go-skynet / go-llama.cpp

LLama.cpp golang bindings
MIT License

Streaming responses #205

Closed JonHolman closed 11 months ago

JonHolman commented 11 months ago

Has anyone used this package to stream responses back to the requester as llama.cpp generates them? If not, any tips on achieving that? Thank you.

synw commented 11 months ago

Yes: I use server-sent events in my minimalist inference server project. Check the code here:

mudler commented 11 months ago

The example in the go-llama.cpp repository shows how to stream responses: https://github.com/go-skynet/go-llama.cpp/blob/0f3da8c646b780d865264aa76403d48763ace289/examples/main.go#L49C65-L49C65

Closing this issue, as there is nothing to do here. Feel free to open a discussion.