osism / issues

This repository is used for bug reports that are cross-project or not bound to a specific repository (or to an unknown repository).
https://www.osism.tech

kubernetes: document how to troubleshoot the metallb service #930

Open garloff opened 5 months ago

garloff commented 5 months ago
upgrade | STILL ALIVE [task 'k3s_server_post : Wait for MetalLB resources' is running] ***
upgrade | failed: [manager.systems.in-a-box.cloud] (item=pods in replica sets) => {"ansible_loop_var": "item", "changed": false, "cmd": ["k3s", "kubectl", "wait", "pod", "--namespace=metallb-system", "--selector=component=controller,app=metallb", "--for", "condition=Ready", "--timeout=240s"], "delta": "0:04:00.080462", "end": "2024-03-21 22:29:37.405992", "item": {"condition": "--for condition=Ready", "description": "pods in replica sets", "resource": "pod", "selector": "component=controller,app=metallb"}, "msg": "non-zero return code", "rc": 1, "start": "2024-03-21 22:25:37.325530", "stderr": "timed out waiting for the condition on pods/controller-786f9df989-v8gpv\ntimed out waiting for the condition on pods/controller-786f9df989-74bhd\ntimed out waiting for the condition on pods/controller-786f9df989-vzznn", "stderr_lines": ["timed out waiting for the condition on pods/controller-786f9df989-v8gpv", "timed out waiting for the condition on pods/controller-786f9df989-74bhd", "timed out waiting for the condition on pods/controller-786f9df989-vzznn"], "stdout": "", "stdout_lines": []}
upgrade | ok: [manager.systems.in-a-box.cloud] => (item=ready replicas of controller)
upgrade | ok: [manager.systems.in-a-box.cloud] => (item=fully labeled replicas of controller)
upgrade | ok: [manager.systems.in-a-box.cloud] => (item=available replicas of controller)
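The failed task is a plain `kubectl wait` on the controller pods, so it can be re-run by hand to see why the pods never become Ready. The following is a minimal sketch, not an official OSISM procedure: the namespace and selector are taken from the log above, while the helper names (`kc`, `metallb_wait`, `metallb_inspect`) and the `KUBECTL` override are hypothetical and only used for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the MetalLB controller pods.
# Namespace and selector are copied from the failing Ansible task.
NS=metallb-system
SEL=component=controller,app=metallb

# Wrapper around kubectl; defaults to "k3s kubectl" as used in the task.
# KUBECTL can be overridden, e.g. KUBECTL=kubectl (assumption: kubeconfig is set up).
kc() { ${KUBECTL:-k3s kubectl} "$@"; }

# Re-run the wait from the failing task, with a configurable timeout.
metallb_wait() {
  kc wait pod --namespace="$NS" --selector="$SEL" \
    --for=condition=Ready --timeout="${1:-240s}"
}

# Collect the usual evidence for pods that do not become Ready.
metallb_inspect() {
  kc get pods --namespace="$NS" --selector="$SEL" -o wide
  kc describe pods --namespace="$NS" --selector="$SEL"
  kc get events --namespace="$NS" --sort-by=.metadata.creationTimestamp
  kc logs --namespace="$NS" --selector="$SEL" --tail=50
}
```

Running `metallb_wait 30s` followed by `metallb_inspect` on the manager node should show whether the pods are pending, crash-looping, or failing their readiness probe; the events and the `describe` output are usually the quickest place to look.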
garloff commented 5 months ago

Re-running the upgrade does not help. Since this is one of the last steps in upgrade.sh, it is not a significant issue: all the base services had already been upgraded successfully by that point.

berendt commented 5 months ago

Cannot be reproduced. How to check and repair the MetalLB service must be documented as part of day-2 operations.
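As a starting point for such documentation, a check-and-repair routine might look like the sketch below. It rests on several assumptions: the deployment is named `controller` (inferred from the pod names `controller-786f9df989-*` in the log), MetalLB is CRD-based so that `ipaddresspools.metallb.io` exists, and a rolling restart is a reasonable first repair step. None of this has been confirmed as the fix for this issue, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical day-2 helpers for the MetalLB controller.
NS=metallb-system

# Wrapper around kubectl; defaults to "k3s kubectl" as in the upgrade task.
kc() { ${KUBECTL:-k3s kubectl} "$@"; }

# Check: deployment status, pod status, and configured address pools.
metallb_check() {
  # Assumption: the deployment is called "controller".
  kc get deployment controller --namespace="$NS"
  kc get pods --namespace="$NS" -o wide
  # Assumption: CRD-based MetalLB configuration is in use.
  kc get ipaddresspools.metallb.io --namespace="$NS"
}

# Repair attempt: rolling restart of the controller, then wait for it to settle.
metallb_repair() {
  kc rollout restart deployment/controller --namespace="$NS" &&
    kc rollout status deployment/controller --namespace="$NS" --timeout=120s
}
```

A typical sequence would be `metallb_check`, then `metallb_repair` if the controller pods are not Ready, then `metallb_check` again. If the restart does not help, the pod events and logs are the next place to look before re-running the upgrade.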