microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/

Add support for request caching #3637

Open ekzhu opened 5 days ago

ekzhu commented 5 days ago

What feature would you like to be added?

Implement a caching mechanism for LLM API calls to avoid unnecessary API calls, similar to the caching available in v0.2.

When enabled, this feature should let us retrieve cached responses for identical LLM requests instead of making new API calls. Ideally, it would include a configuration flag to enable or disable caching, as well as a way to manage the cache.

We don't need to follow the same API as in v0.2; the cache can be managed by the model client instead.
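
As a rough illustration of the client-managed approach, a caching layer could wrap the model client and key responses on the serialized request. This is only a minimal sketch; the names here (`CachedModelClient`, `enable_cache`, `clear_cache`, and the `create` signature) are hypothetical and not part of any existing AutoGen API.

```python
# Hypothetical sketch of a cache-wrapping model client; all names here are
# illustrative, not part of the actual AutoGen API.
import hashlib
import json
from typing import Any, Dict, List


class CachedModelClient:
    """Wraps a model client and serves identical requests from a cache."""

    def __init__(self, client: Any, enable_cache: bool = True) -> None:
        self._client = client
        self._enable_cache = enable_cache  # configuration flag to toggle caching
        self._cache: Dict[str, Any] = {}   # in-memory store; could be disk-backed

    def _key(self, messages: List[Dict[str, str]], **kwargs: Any) -> str:
        # Deterministic key over the full request payload, so only truly
        # identical requests hit the cache.
        payload = json.dumps({"messages": messages, **kwargs}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    async def create(self, messages: List[Dict[str, str]], **kwargs: Any) -> Any:
        if not self._enable_cache:
            return await self._client.create(messages, **kwargs)
        key = self._key(messages, **kwargs)
        if key not in self._cache:  # cache miss: make the real API call
            self._cache[key] = await self._client.create(messages, **kwargs)
        return self._cache[key]

    def clear_cache(self) -> None:
        # Simple cache-management hook.
        self._cache.clear()
```

One advantage of wrapping at the client level rather than exposing a separate cache API (as in v0.2) is that caching stays transparent to agents: any code that accepts a model client gets caching for free when handed the wrapped client.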

Why is this needed?

Save cost on identical inference requests.

ekzhu commented 5 days ago

@husseinmozannar moved your issue here.