Open Manouchehri opened 2 months ago
I don't see how you've set up caching. Can you share that too?
```yaml
litellm_settings:
  drop_params: True
  cache: True
  cache_params:
    type: s3
    s3_bucket_name: os.environ/CACHING_S3_BUCKET_NAME
    s3_region_name: os.environ/CACHING_AWS_DEFAULT_REGION
    s3_aws_access_key_id: os.environ/CACHING_AWS_ACCESS_KEY_ID
    s3_aws_secret_access_key: os.environ/CACHING_AWS_SECRET_ACCESS_KEY
    s3_endpoint_url: os.environ/CACHING_AWS_ENDPOINT_URL_S3
  failure_callback: ["sentry", "langfuse"]
  num_retries_per_request: 3
  success_callback: ["langfuse", "s3"]
  s3_callback_params:
    s3_bucket_name: os.environ/LOGGING_S3_BUCKET_NAME
    s3_region_name: os.environ/LOGGING_AWS_DEFAULT_REGION
    s3_aws_access_key_id: os.environ/LOGGING_AWS_ACCESS_KEY_ID
    s3_aws_secret_access_key: os.environ/LOGGING_AWS_SECRET_ACCESS_KEY
    s3_endpoint_url: os.environ/LOGGING_AWS_ENDPOINT_URL_S3
  default_team_settings:
    - team_id: david_dev
      success_callback: ["langfuse", "s3"]
      langfuse_secret: os.environ/LANGFUSE_PRIVATE_KEY_DAVID
      langfuse_public_key: os.environ/LANGFUSE_PUBLIC_KEY_DAVID

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
  database_connection_pool_limit: 1
  disable_spend_logs: True

router_settings:
  routing_strategy: simple-shuffle

environment_variables:

model_list:
  - model_name: gemini-1.5-pro-preview-0409
    litellm_params:
      model: vertex_ai/gemini-1.5-pro-preview-0409
      vertex_project: litellm-epic
      vertex_location: northamerica-northeast1
      max_tokens: 8192
  - model_name: gemini-1.5-pro-preview-0409
    litellm_params:
      model: vertex_ai/gemini-1.5-pro-preview-0409
      vertex_project: litellm-epic
      vertex_location: southamerica-east1
      max_tokens: 8192
```
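As an aside, the `os.environ/...` values in the config above are placeholders that the proxy resolves from environment variables when it loads the file. A minimal sketch of that resolution pattern (the helper name here is hypothetical, illustrative only, and not LiteLLM's actual loader code):

```python
import os

def resolve_env_ref(value):
    """Resolve a config value of the form 'os.environ/VAR_NAME' to the
    corresponding environment variable; pass anything else through unchanged.
    (Illustrative sketch; LiteLLM's real config loader may differ.)"""
    if isinstance(value, str) and value.startswith("os.environ/"):
        return os.environ.get(value.split("/", 1)[1])
    return value

# Example usage:
os.environ["CACHING_S3_BUCKET_NAME"] = "my-cache-bucket"
print(resolve_env_ref("os.environ/CACHING_S3_BUCKET_NAME"))  # my-cache-bucket
print(resolve_env_ref("s3"))  # s3 (literal values are unchanged)
```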
I am using a key that belongs to `david_dev`.
I believe we have some testing on this. Will look into this more.
@Manouchehri would help if you could add any bugs you believe we should prioritize to this week's bug bash - https://github.com/BerriAI/litellm/issues/3045
Heading to bed atm, will do tomorrow! Thank you! This one and the s3 team logging are the two highest priorities for me for sure.
Do you want me to maybe create github issue labels for low, medium, high, and critical priorities? That's what my team does for our internal projects. 😀
This is still a bug btw, checked today.
What happened?
Caching does not seem to be working with this PoC:
Caching is working with this:
Note: if you run the non-streaming script, then the streaming script will successfully use the cache.
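That observation is consistent with the cache key being computed from the request parameters without the `stream` flag, so a non-streaming call can populate an entry that a later streaming call then hits. A hypothetical illustration of that keying scheme (this is my own sketch, not LiteLLM's actual cache-key code):

```python
import hashlib
import json

def cache_key(params):
    """Build a cache key from request params, dropping 'stream' so that
    streaming and non-streaming requests share the same key.
    (Hypothetical sketch of the behavior observed in this issue.)"""
    keyed = {k: v for k, v in sorted(params.items()) if k != "stream"}
    return hashlib.sha256(json.dumps(keyed).encode()).hexdigest()

non_streaming = {
    "model": "gemini-1.5-pro-preview-0409",
    "messages": [{"role": "user", "content": "hi"}],
}
streaming = {**non_streaming, "stream": True}

# Same key => the non-streaming response seeds the cache for streaming calls.
assert cache_key(non_streaming) == cache_key(streaming)
```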
Relevant log output
No response
Twitter / LinkedIn details
https://www.linkedin.com/in/davidmanouchehri/