Closed markNZed closed 1 year ago
great idea 💯
here are some related notes I had on this.
COSS / hosted proxy for interacting with the OpenAI API and/or other main AI APIs
- problems it tries to solve
- caching (seriously POST caching for embeddings & completions alone is huge)
- observability
- monitoring
- privacy (e.g., see https://github.com/cado-security/masked-ai)
- general engineering productionization concerns like latency that you guys have already graphed
- the ability to have more than 5 keys and to put limits on each individual key
- OSS examples
- https://github.com/egoist/openai-proxy
- https://github.com/6/openai-caching-proxy-worker
- https://github.com/cado-security/masked-ai
- https://github.com/easychen/openai-api-proxy
- companies
- [Helicone (YC W23)](https://www.helicone.ai/)
- notes
- pro: could quickly build an MVP of this using CF workers and something like [Reflare](https://github.com/xiaoyang-sde/reflare)
- con: imho this is def a feature, not a platform, and openai will build better first-party support for these use cases over time
- same idea but substitute Cohere/Anthropic/etc or building a normalization layer across all of these
- if you could manage to get traction w/ this normalization layer, you could eventually replace the third-party APIs with your own first-party APIs over time, but disintermediation would be really tough w/ this sort of thing
- related to this idea, there have been a lot of viral Bring-Your-Own-Key demos built on top of OpenAI. I’d love to build a lightweight solution that addresses this use case in a more trustworthy, secure manner, since pasting your OpenAI secret key into random webapps is a security nightmare
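To make the POST-caching point above concrete: since identical request bodies to the embeddings/completions endpoints produce (for deterministic settings) interchangeable responses, the request body itself can serve as the cache key. Here's a minimal, hypothetical sketch of that idea — `cached_post`, `cache_key`, and the in-memory `_cache` are illustrative names, not part of any of the linked projects:

```python
import hashlib
import json

# Hypothetical sketch of POST-response caching for embeddings/completions.
# A real proxy would use a shared store (e.g. Redis or a CDN cache) with TTLs;
# a plain dict keeps the idea visible.
_cache: dict[str, dict] = {}

def cache_key(path: str, body: dict) -> str:
    """Derive a deterministic key from the endpoint and a canonicalized body.

    Sorting keys and stripping whitespace means two requests that differ only
    in JSON key order or formatting hit the same cache entry.
    """
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{path}:{canonical}".encode()).hexdigest()

def cached_post(path: str, body: dict, send) -> dict:
    """Return a cached response if present; otherwise call `send` and cache it."""
    key = cache_key(path, body)
    if key not in _cache:
        _cache[key] = send(path, body)  # `send` stands in for the real upstream call
    return _cache[key]
```

Note that this only makes sense for requests where repeated calls should return the same answer (embeddings, or completions with `temperature=0`); caching sampled completions changes observable behavior.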
I'm going to close this issue as out of scope for this repo, but I hope my notes above are useful for anyone looking to add this into their workflow – or anyone who wants to build this type of caching abstraction 🔥
thanks @markNZed 🙏
Describe the feature
During development it could be useful to cache OpenAI API responses while keeping behaviors like the incremental returning of results. This might be a proxy in front of the OpenAI API, possibly as a separate project, like https://github.com/easychen/openai-api-proxy
This can be done at the level of the application using chatgpt-api but things like streaming add some complications.
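One way to handle the streaming complication is to record the sequence of chunks the first time a request streams through, then replay them one by one on later cache hits, so callers still receive results incrementally. A hypothetical sketch (the names `stream_completion` and `_stream_cache` are illustrative, and `upstream` stands in for the real SSE chunk iterator):

```python
from typing import Iterator

# Hypothetical in-memory cache mapping a request key to its recorded chunks.
_stream_cache: dict[str, list[str]] = {}

def stream_completion(key: str, upstream: Iterator[str]) -> Iterator[str]:
    """Yield chunks incrementally; record them so later calls replay the stream."""
    if key in _stream_cache:
        # Cache hit: replay the recorded chunks, preserving incremental delivery.
        yield from _stream_cache[key]
        return
    chunks: list[str] = []
    for chunk in upstream:
        chunks.append(chunk)
        yield chunk
    # Only populate the cache once the stream completed, so an abandoned or
    # failed stream never poisons the cache with a partial response.
    _stream_cache[key] = chunks
```

A replay could also insert small delays between chunks to mimic upstream pacing, but delivering them immediately is usually what you want during development.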