danlenar opened this issue 2 years ago
Oh, it seems that you have a lot of failovers... It is possible to reduce the number of history lines stored in the annotation by using max_timelines_history parameter.
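For reference, `max_timelines_history` is a top-level key of Patroni's dynamic (DCS) configuration, not a `postgresql` parameter. A minimal sketch of what `patronictl edit-config` should contain (values are examples):

```yaml
# Patroni dynamic configuration, edited via `patronictl edit-config`.
# `max_timelines_history` sits at the top level, NOT under postgresql.parameters.
max_timelines_history: 10   # keep at most 10 entries in the history annotation
postgresql:
  parameters:
    archive_mode: 'on'      # regular Postgres parameters go here instead
```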
I have the same issue. I tried to set max_timelines_history: 10
in patronictl edit-config,
restarted all DB pods, even restarted the postgres-operator pod, and deleted the endpoint config acid-prod-api.
I also tried adding it to the database YAML config:
postgresql:
  parameters:
    max_timelines_history: "10"
but I am still getting this error:
ERROR: Unexpected error from Kubernetes API
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 483, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 877, in patch_or_create
    return self._patch_or_create(name, annotations, resource_version, patch, retry, ips)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 868, in _patch_or_create
    ret = retry(func, self._namespace, body) if retry else func(self._namespace, body)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 468, in wrapper
    return getattr(self._core_v1_api, func)(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 404, in wrapper
    return self._api_client.call_api(method, path, headers, body, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 373, in call_api
    return self._handle_server_response(response, _preload_content)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 203, in _handle_server_response
    raise k8s_client.rest.ApiException(http_resp=response)
patroni.dcs.kubernetes.K8sClient.rest.ApiException: (422)
Reason: Unprocessable Entity
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'c8949d3d-9984-421a-ad33-3a62c453fd6c', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': 'ca267a38-8ca9-4f84-8fd7-25684e895f05', 'X-Kubernetes-Pf-Prioritylevel-Uid': '80ccb9a2-95f6-47fb-bb00-dd34f2f81d54', 'Date': 'Mon, 18 Jul 2022 09:55:39 GMT', 'Content-Length': '753'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Endpoints \\"acid-prod-api-config\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes","reason":"Invalid","details":{"name":"acid-prod-api-config","kind":"Endpoints","causes":[{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"}]},"code":422}\n'
It looks like it is cached somewhere and nothing helps. Do you have an idea where I can clean it?
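For context, the 422 above is Kubernetes enforcing its 262144-byte cap on the total size of the metadata.annotations map. A small shell sketch (the namespace is an example; requires python3) that sums the annotation bytes of an object dumped with `kubectl ... -o json`, so you can see how close it is to the limit:

```shell
# Sum the byte size of all annotation keys and values in a Kubernetes
# object's JSON dump, to compare against the 262144-byte limit.
annotation_bytes() {
  python3 -c '
import json, sys
meta = json.load(sys.stdin).get("metadata", {})
ann = meta.get("annotations") or {}
print(sum(len(k.encode()) + len(v.encode()) for k, v in ann.items()))
'
}
# Example usage against a live cluster (namespace is an example):
#   kubectl -n default get ep acid-prod-api-config -o json | annotation_bytes
```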
Hi! I fixed it manually: kubectl exec into the pod and run "patronictl edit-config". "maximum_lag_on_failover" belongs to the Patroni layer (config), not to Postgres. The list of parameters allowed in the manifest is limited (see patroni-parameters in the docs). Good luck!
patronictl edit-config
Thank you very much, it helped.
[SOLVED]
Hi guys, thanks for the help!
We are also running the postgres operator and we had the same exception being thrown.
We followed the steps you provided:
- manually shortened the endpoint metadata via kubectl -n tefde-bmi-ci-infra edit ep postgres-hive-metastore-config -o yaml (this enabled changing the config via patronictl)
- changed the config via patronictl (initially not possible because we were getting the same metadata-too-long exception), adding max_timelines_history: 10
But we are still getting the same error: metadata.annotations: Too long: must have at most 262144 bytes, even though the file is now quite small, less than 1000 bytes. We tried uninstalling the operator and installing it again, but that did not solve the issue.
We seem to have the same "cached somewhere" problem as @wasap. We read the last comment from @vbortnikov but we could not understand whether we had to set maximum_lag_on_failover and, if so, to what value.
Any other suggestions?
We were putting the max_timelines_history parameter in the wrong place; it goes in the outer part of the config:
max_timelines_history: 10
maximum_lag_on_failover: 33554432
postgresql:
  parameters:
    archive_mode: 'on'
    archive_timeout: 1800s
    autovacuum_analyze_scale_factor: 0.02
We were wondering where the config is cached, because we modified it manually in k8s but this was not enough.
Happy coding
Connect to each pod with kubectl exec ...,
run patronictl edit-config,
and add max_timelines_history: 10 there.
We are facing the same issue in our Kubernetes deployment. Could this max_timelines_history parameter be included in the Postgres manifest? Currently we are able to set only these. However, having max_timelines_history set to some specific value, or a finite default, would prevent issues with large manifests: {"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"}
We ran into this issue as well. It seems the Postgres Operator was stuck in a reconcile loop due to some bad node-affinity settings, which is likely(?) why it built up a ridiculously large history (thousands of entries in the history annotation).
We had to scale down the Postgres Operator and the Postgres deployment, remove the annotation history from the endpoint, then scale the services back up and apply the max_timelines_history: 10 configuration manually. Thank you @wasap and others for the solution there.
It would be really nice to have a way to set this permanently, or even just by default - what is this history even used for?
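The "remove the annotation history from the endpoint" step above can be sketched as a filter over the object's JSON. Two assumptions to verify against your cluster first: Patroni's Kubernetes DCS keeps the timeline history under a "history" annotation on the <cluster>-config Endpoints object, and the operator and cluster pods are scaled down while you rewrite it:

```shell
# Drop the (huge) "history" annotation from an Endpoints JSON dump so the
# object can be written back under the 262144-byte annotation limit.
strip_history() {
  python3 -c '
import json, sys
obj = json.load(sys.stdin)
obj.get("metadata", {}).get("annotations", {}).pop("history", None)
print(json.dumps(obj))
'
}
# Usage sketch (namespace is an example):
#   kubectl -n default get ep acid-prod-api-config -o json \
#     | strip_history | kubectl replace -f -
```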
Please answer some short questions, which should help us understand your problem/question better:
Which image of the operator are you using? registry.opensource.zalan.do/acid/postgres-operator:v1.8.0
Where do you run it - cloud or metal? Kubernetes or OpenShift? [AWS K8s | GCP ... | Bare Metal K8s] Bare Metal K8S
Are you running Postgres Operator in production?
yes
Type of issue? [Bug report, question, feature request, etc.]
k -n <> get ep postgres-postgres-config