BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Allow Redis Semantic caching with custom Embedding models #4001

Open anandsriraman opened 4 months ago

anandsriraman commented 4 months ago

The Feature

Currently, I notice that the schema for the Redis semantic cache enforces a vector dimension of 1536. This works well with OpenAI's text-embedding-ada-002 model but fails for any other embedding model.

    schema = {
        "index": {
            "name": "litellm_semantic_cache_index",
            "prefix": "litellm",
            "storage_type": "hash",
        },
        "fields": {
            # note: the duplicated "text" key means only the "prompt" entry survives
            "text": [{"name": "response"}],
            "text": [{"name": "prompt"}],
            "vector": [
                {
                    "name": "litellm_embedding",
                    "dims": 1536,  # hard-coded to the size of text-embedding-ada-002 vectors
                    "distance_metric": "cosine",
                    "algorithm": "flat",
                    "datatype": "float32",
                }
            ],
        },
    }

Please make the embedding dims configurable from the model-config.yaml file so that a wider range of embedding models deployed with LiteLLM can be used for the cache.
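
As a sketch of what this could look like in the proxy config: the cache_params keys below follow the documented redis-semantic cache settings, except redis_semantic_cache_embedding_dimensions, which is the hypothetical new option this issue asks for (the model name and value are only examples):

    litellm_settings:
      cache: True
      cache_params:
        type: "redis-semantic"
        similarity_threshold: 0.8
        # existing setting: which deployed model generates cache embeddings
        redis_semantic_cache_embedding_model: bge-small-en
        # proposed (hypothetical) setting: vector size of that model's embeddings
        redis_semantic_cache_embedding_dimensions: 384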

Motivation, pitch

For caching, much smaller embedding models may be preferred for their cost and speed. Making the dims configurable opens up far more interesting modes of caching and cache optimization. It should also reduce costs significantly, since it avoids a call to OpenAI every time a new user query is received.
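
To make that concrete, the index dimension for any such model can be discovered at runtime with a single probe call through the public litellm.embedding API; a rough sketch (the model name is only a placeholder for whatever embedding model is deployed behind LiteLLM):

    import litellm

    def detect_embedding_dims(model: str) -> int:
        """Embed a short probe string and return the vector length.

        Works for any model litellm can call, so the cache index can be
        sized to match cheaper/smaller embedding models instead of
        assuming OpenAI's 1536-dim vectors.
        """
        response = litellm.embedding(model=model, input=["dimension probe"])
        return len(response.data[0]["embedding"])

    # Example with a smaller, cheaper model than text-embedding-ada-002
    # (placeholder model name for illustration only).
    dims = detect_embedding_dims("huggingface/BAAI/bge-small-en-v1.5")
    print(dims)  # e.g. 384 for bge-small-en-v1.5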

Twitter / LinkedIn details

No response

ishaan-jaff commented 4 months ago

Hi @anandsriraman, I followed up over LinkedIn to better understand what you need. This is my LinkedIn: https://www.linkedin.com/in/reffajnaahsi/