Open ngc4579 opened 2 years ago
A quay.io/kiwigrid/k8s-sidecar:1.21.0 container deployed alongside Grafana in a k8s 1.24 evaluation cluster seems to start correctly, but afterwards floods the logs with ProtocolError messages as seen below. Is this a known issue?
Hi @ngc4579, no, it's not AFAIK. Can you provide us with some more details? Are you able to call the k8s API manually (through a curl pod) to confirm the cluster config (k8s API server config) is correct?
@jekkel Thanks for your answer... Well, at least a

$ curl --cacert ${CACERT} --header "Authorization: Bearer ${TOKEN}" -X GET ${APISERVER}/api

returns some sensible result as far as I can tell:
{
  "kind": "APIVersions",
  "versions": [
    "v1"
  ],
  "serverAddressByClientCIDRs": [
    {
      "clientCIDR": "0.0.0.0/0",
      "serverAddress": "api.eval.420214.internal.prod.<redacted>:443"
    }
  ]
}
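FWIW, roughly the same sanity check can be done with the Python client the sidecar uses. A minimal sketch, assuming it runs inside the cluster with the pod's service account mounted:

# Sketch: query /api via the official client instead of curl (assumes in-cluster service account)
from kubernetes import client, config

config.load_incluster_config()               # picks up the pod's token and CA bundle
api_versions = client.CoreApi().get_api_versions()
print(api_versions.versions)                 # e.g. ['v1'] when the API server is reachable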
We noticed the same issue around 2 months ago. It does not seem to be critical, because the ConfigMaps are still processed successfully; the only sign that something is not right is the error messages in the logs.
And this is a stack trace which I managed to retrieve:
Traceback (most recent call last):
  File "/app/.venv/lib/python3.10/site-packages/urllib3/response.py", line 700, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.10/site-packages/urllib3/response.py", line 441, in _error_catcher
    yield
  File "/app/.venv/lib/python3.10/site-packages/urllib3/response.py", line 767, in read_chunked
    self._update_chunk_length()
  File "/app/.venv/lib/python3.10/site-packages/urllib3/response.py", line 704, in _update_chunk_length
    raise InvalidChunkLength(self, line)
urllib3.exceptions.InvalidChunkLength: InvalidChunkLength(got length b'', 0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/dbg.py", line 6, in <module>
    for event in stream:
  File "/app/.venv/lib/python3.10/site-packages/kubernetes/watch/watch.py", line 165, in stream
    for line in iter_resp_lines(resp):
  File "/app/.venv/lib/python3.10/site-packages/kubernetes/watch/watch.py", line 56, in iter_resp_lines
    for seg in resp.stream(amt=None, decode_content=False):
  File "/app/.venv/lib/python3.10/site-packages/urllib3/response.py", line 575, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/app/.venv/lib/python3.10/site-packages/urllib3/response.py", line 755, in read_chunked
    with self._error_catcher():
  File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/app/.venv/lib/python3.10/site-packages/urllib3/response.py", line 458, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
This is the code that was used (it was running in a loop, because most of the time it works as expected):
from kubernetes import client, config, watch
config.load_incluster_config()
v1 = client.CoreV1Api()
stream = watch.Watch().stream(v1.list_config_map_for_all_namespaces, label_selector='grafana_dashboard=1', timeout_seconds=60, _request_timeout=66)
for event in stream:
    print(event['object'].metadata.name)
So far it looks more like a networking issue than an issue with k8s-sidecar itself.
Related issues in the Kubernetes python client:
The suggested solution (which is actually a workaround) is to implement retry logic. However, looking at k8s-sidecar's code, that does not seem to be really needed, because retry logic is already implemented at a higher level. Personally, I'd decrease the severity of the corresponding log message from ERROR to WARNING, because, in fact, everything is working fine.
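For illustration, such retry logic around the watch could look roughly like this. It is only a sketch, not the sidecar's actual code, and handle_event is a hypothetical placeholder:

import logging

import urllib3
from kubernetes import client, config, watch

def handle_event(event):
    # hypothetical placeholder for whatever should happen per event
    print(event['object'].metadata.name)

config.load_incluster_config()
v1 = client.CoreV1Api()

while True:
    try:
        stream = watch.Watch().stream(
            v1.list_config_map_for_all_namespaces,
            label_selector='grafana_dashboard=1',
            timeout_seconds=60,
            _request_timeout=66,
        )
        for event in stream:
            handle_event(event)
    except urllib3.exceptions.ProtocolError as exc:
        # The API server (or something in between) closed the chunked watch stream;
        # log it and re-establish the watch instead of treating it as fatal.
        logging.warning("Watch connection broken, re-establishing: %s", exc)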
Hm, I am running into the same issue, and can confirm it's still the case with 1.24.3. Reconciliation is fine, it does its job, and I cannot confirm any network issues, but I get spammed with

msg": "ProtocolError when calling kubernetes: (\"Connection broken: InvalidChunkLength(got length b'', 0 bytes read)\", InvalidChunkLength(got length b'', 0 bytes read))\n", "level": "ERROR"

every 55 seconds 🤔
With watchMethod: SLEEP those error logs are gone, and every 65 seconds I receive a Performing list-based sync on ... instead. That's my workaround for now to get rid of the error logs.
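In case it helps anyone deploying the sidecar through the Grafana Helm chart, the change looks roughly like this in the values file. This is just a sketch and assumes the chart's sidecar.dashboards keys; adjust to your own setup:

sidecar:
  dashboards:
    enabled: true
    label: grafana_dashboard
    # SLEEP performs periodic list-based syncs instead of keeping a WATCH stream open
    watchMethod: SLEEP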
This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.