janhq / cortex.cpp

Local AI API Platform
https://cortex.so
Apache License 2.0

feat: Allow transforming sse chunk response #568

Closed: grafail closed this 3 weeks ago

grafail commented 6 months ago

Problem
Currently it is unnecessarily complicated to handle SSE output from custom remote inference engines that do not follow the OpenAI spec.

Success Criteria
The ability to transform the response, similar to how "transformResponse" is used in non-streaming cases.

Additional context
Perhaps a function could be passed in and used around this line to customise how an SSE chunk is handled:

https://github.com/janhq/jan/blob/0bad1a479f642c8c5183017641656d52c4b33c61/core/src/browser/extensions/engines/helpers/sse.ts#L74
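A minimal sketch of what such a hook could look like. This is not the actual Jan API: `readSSE` and the `ChunkTransform` type are illustrative names, assuming the SSE loop in `sse.ts` splits the stream into `data:` lines before parsing.

```typescript
// Hypothetical sketch: a per-chunk transform hook applied to each SSE data
// payload before it is handed to the rest of requestInference.
type ChunkTransform = (raw: string) => string;

async function* readSSE(
  body: ReadableStream<Uint8Array>,
  transformChunk: ChunkTransform = (s) => s, // identity: OpenAI-compatible engines
): AsyncGenerator<string> {
  const decoder = new TextDecoder();
  const reader = body.getReader();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep any incomplete line for the next read
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const data = line.slice("data: ".length).trim();
      if (data === "[DONE]") return;
      // A custom engine extension could reshape its payload here.
      yield transformChunk(data);
    }
  }
}
```

An extension targeting a non-OpenAI engine would then pass its own `transformChunk`, while existing callers are unaffected by the identity default.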

louis-jan commented 5 months ago

@grafail how do you see a transformResponse being embedded into a request? I see this as a client responsibility, where the SSE protocol is more or less just a communication method. We are working on client SDKs in both JS and Python, so integration could be easier in this case.

grafail commented 5 months ago

I was thinking about this the same way transformResponse is used to make other LLMs compatible in the non-streaming case (e.g. https://github.com/janhq/jan/blob/0cae7c97612ba4f6a3387f7f72c56e065711cacf/extensions/inference-anthropic-extension/src/index.ts#L87). Effectively, a way to adjust the format of my SSE response to the one Jan is expecting without having to rewrite the entire requestInference logic.
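To make the analogy concrete, here is a sketch of what such a streaming transform might do for an Anthropic-style engine: map each incoming SSE payload into the OpenAI `chat.completion.chunk` delta shape that Jan's loop expects. The `transformChunk` name and the exact field mapping are assumptions for illustration, not Jan code.

```typescript
// Shape Jan's streaming loop expects (simplified OpenAI chunk).
interface OpenAIChunk {
  choices: {
    delta: { content?: string };
    index: number;
    finish_reason: string | null;
  }[];
}

// Hypothetical streaming counterpart to transformResponse: Anthropic streams
// text as {"type":"content_block_delta","delta":{"text":"..."}}, which we
// rewrap as an OpenAI-style delta.
function transformChunk(raw: string): string {
  const parsed = JSON.parse(raw);
  if (parsed.type === "content_block_delta") {
    const openai: OpenAIChunk = {
      choices: [
        {
          delta: { content: parsed.delta?.text ?? "" },
          index: 0,
          finish_reason: null,
        },
      ],
    };
    return JSON.stringify(openai);
  }
  return raw; // pass through payloads already in the expected shape
}
```

With a hook like this, the generic SSE plumbing stays untouched and each engine extension only supplies its own per-chunk mapping.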

louis-jan commented 2 months ago

Feature request: I'm not sure how we should improve the UX to enable this feature. Also, I don't know whether this is part of our product vision.

0xSage commented 3 weeks ago

no longer relevant