Open pweglik opened 1 week ago
how exactly do you invoke the model?
on a separate note, have you considered switching to Gemini?
We use load_summarize_chain with either stuff or map_reduce. Then we call invoke() on those chains.
We're happy with PaLM2 for our use case here
Environment details
python:3.11
google-auth version: 2.29.0
Description
We created a simple Flask server and deployed it to GCP as a Cloud Run service. We also use a few other dependencies:
snippet of the code:
We don't do anything more sophisticated than that. After deployment, it ran fine for a few hours, and then we started to receive warnings:
You can see that there are two fields named x-goog-api-client, and one of them is growing out of proportion. Later it grew even bigger, and we started to receive the warning on almost every request. The server also started to time out, as it was unable to serve those requests.
It looks like something is appended to this field and it overflows after some time. I found a place in the library's code that could cause it: https://github.com/googleapis/google-auth-library-python/blob/main/google/auth/metrics.py#L138-L154
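To make the suspicion concrete, here is a minimal sketch of the failure mode I have in mind. It is not the library's code: `append_metric_token`, the token string, and the idea that a headers dict is reused across requests are all my assumptions. It only shows that appending to a header value in place grows without bound when the same dict is shared, and stays constant when each request gets a fresh dict.

```python
# Hypothetical illustration only; this is NOT google-auth's actual code.
API_CLIENT_HEADER = "x-goog-api-client"
TOKEN = "gl-python/3.11 auth/2.29.0"  # made-up token, for illustration


def append_metric_token(headers, token, key=API_CLIENT_HEADER):
    """Append a token to the header value in place (the suspected pattern)."""
    if key in headers:
        headers[key] = headers[key] + " " + token
    else:
        headers[key] = token


# Case 1 (assumed bug): one headers dict is reused for every request,
# so each call makes the value longer.
shared_headers = {}
for _ in range(1000):
    append_metric_token(shared_headers, TOKEN)
shared_size = len(shared_headers[API_CLIENT_HEADER])

# Case 2: each request starts from a fresh dict, so the size never changes.
fresh_sizes = set()
for _ in range(1000):
    per_request_headers = {}
    append_metric_token(per_request_headers, TOKEN)
    fresh_sizes.add(len(per_request_headers[API_CLIENT_HEADER]))

print("shared dict header size:", shared_size)
print("fresh dict header sizes:", fresh_sizes)
```

If something like case 1 is happening somewhere between our code and the library, it would match what we see: the header is fine at first and only becomes a problem after a few thousand requests.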
I'm looking for guidance on what could cause this warning and the overflow in requests.
Steps to reproduce
I'm not really sure; the error only occurred after a few hours (after serving a few thousand requests).
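Since I can't reproduce it on demand, one option is to watch the header so the growth is caught early. Below is a rough sketch of a check I could run wherever the outgoing headers are visible. The function name, the logger name, and the 4096-byte threshold are my own choices, not anything from google-auth or Cloud Run.

```python
import logging

logger = logging.getLogger("header_watch")  # hypothetical logger name
MAX_HEADER_BYTES = 4096  # assumed threshold, tune as needed


def check_api_client_header(headers, limit=MAX_HEADER_BYTES):
    """Return (size_in_bytes, duplicate_token_count) for x-goog-api-client.

    Logs a warning when the value exceeds `limit` or repeats any token.
    """
    value = headers.get("x-goog-api-client", "")
    tokens = value.split()
    duplicates = len(tokens) - len(set(tokens))
    size = len(value.encode("utf-8"))
    if size > limit or duplicates:
        logger.warning(
            "x-goog-api-client looks suspicious: %d bytes, %d duplicate tokens",
            size,
            duplicates,
        )
    return size, duplicates
```

For example, `check_api_client_header({"x-goog-api-client": "a/1 b/2 a/1"})` reports one duplicate token, which would be the first sign of the header being appended to repeatedly.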
I have also created this issue in the google-auth library repo, but maybe someone here will be able to help.
Let me know if I can help you somehow or provide any additional info!