Closed fracappa closed 11 months ago
This issue is currently awaiting triage.
If Metal3.io contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Please take a look at this issue. @kashifest @lentzi90 @Rozzii @dtantsur
FYI we've been talking about this on slack https://kubernetes.slack.com/archives/CHD49TLE7/p1693206310793309
Hi there. I've been dealing with the same problem for a week and I could not find any solution.
The only lead I have found points to a potential network misconfiguration.
More specifically, I'm getting this error from the ironic-inspector container within the ironic pod:
2023-08-30 08:01:58.702 1 ERROR ironic_inspector.conductor.manager [-] The periodic ironic_inspector.conductor.manager.sync_with_ironic failed with: Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/futurist/periodics.py", line 290, in run
work()
File "/usr/lib/python3.9/site-packages/futurist/periodics.py", line 64, in __call__
return self.callback(*self.args, **self.kwargs)
File "/usr/lib/python3.9/site-packages/futurist/periodics.py", line 178, in decorator
return f(*args, **kwargs)
File "/usr/lib/python3.9/site-packages/ironic_inspector/conductor/manager.py", line 233, in sync_with_ironic
ironic_node_uuids = {node.id for node in ironic_nodes}
File "/usr/lib/python3.9/site-packages/ironic_inspector/conductor/manager.py", line 233, in <setcomp>
ironic_node_uuids = {node.id for node in ironic_nodes}
File "/usr/lib/python3.9/site-packages/openstack/resource.py", line 2077, in list
exceptions.raise_from_response(response)
File "/usr/lib/python3.9/site-packages/openstack/exceptions.py", line 263, in raise_from_response
raise cls(
openstack.exceptions.HttpException: HttpException: 401: Client Error for url: http://<IP>:6385/v1/nodes?fields=uuid, Incorrect username or password
: None: None
2023-08-30 08:01:58.702 1 ERROR ironic_inspector.conductor.manager NoneType: None
2023-08-30 08:01:58.702 1 ERROR ironic_inspector.conductor.manager
There seems to be an authentication error on the ironic endpoint when trying to access the /v1/nodes
subresource, but I don't know anything about credentials; at least, I didn't set any myself. Still, I'm not sure this is related to the root issue.
It should not be a connectivity issue, since I can curl <ironic-ip>:port/v1/ and correctly receive a response message.
Furthermore, I've noticed that my ironic pod takes the IP address of a different NIC than the provisioning interface I attached it to. For example, I have eno1 and eno2: eno1 has a static public IP used to SSH into the server, and eno2 has a custom static IP (which in my case I use for the ironic endpoint, e.g. 172.22.0.1). I don't know if this could be the problem.
Thanks to anybody who can help me.
I redeployed the k8s cluster with kubeadm without setting a custom configuration (--config=config.yaml), and something changed.
Now the ironic-dnsmasq container seems to be trying to hand out an IP address to the target host through the provisioning interface. Its logs show the following:
dnsmasq-dhcp: 925516945 DHCPDISCOVER(enp0s31f6) <MAC-address>
dnsmasq-dhcp: 925516945 tags: enp0s31f6
dnsmasq-dhcp: 925516945 DHCPOFFER(enp0s31f6) 172.23.0.33 <MAC-address>
dnsmasq-dhcp: 925516945 requested options: 1:netmask, 3:router, 12:hostname, 15:domain-name,
dnsmasq-dhcp: 925516945 requested options: 6:dns-server, 26:mtu, 33:static-route, 121:classless-static-route,
dnsmasq-dhcp: 925516945 requested options: 119:domain-search, 42:ntp-server, 120:sip-server
dnsmasq-dhcp: 925516945 bootfile name: /undionly.kpxe
dnsmasq-dhcp: 925516945 server name: 172.23.0.1
dnsmasq-dhcp: 925516945 next server: 172.23.0.1
dnsmasq-dhcp: 925516945 sent size: 1 option: 53 message-type 2
dnsmasq-dhcp: 925516945 sent size: 4 option: 54 server-identifier 172.23.0.1
dnsmasq-dhcp: 925516945 sent size: 4 option: 51 lease-time 1h
dnsmasq-dhcp: 925516945 sent size: 4 option: 58 T1 30m
dnsmasq-dhcp: 925516945 sent size: 4 option: 59 T2 52m30s
dnsmasq-dhcp: 925516945 sent size: 4 option: 1 netmask 255.255.255.0
dnsmasq-dhcp: 925516945 sent size: 4 option: 28 broadcast 172.23.0.255
dnsmasq-dhcp: 925516945 available DHCP range: 172.23.0.10 -- 172.23.0.100
dnsmasq-dhcp: 925516945 client provides name: fall
dnsmasq-dhcp: 3988913982 available DHCP range: 172.23.0.10 -- 172.23.0.100
dnsmasq-dhcp: 3988913982 client provides name: fall
dnsmasq-dhcp: 3988913982 DHCPDISCOVER(enp0s31f6) <MAC-address>
dnsmasq-dhcp: 3988913982 tags: enp0s31f6
dnsmasq-dhcp: 3988913982 DHCPOFFER(enp0s31f6) 172.23.0.33 <MAC-address>
After a while, I get a timeout, which then turns into an error during the registration phase of my BareMetalHost resource.
Does somebody have any clue why this happens?
/cc @Rozzii I will also take a look later. IMO this is not a bug; it looks like a credential misconfiguration. I will remove the bug label and add question instead.
I have no idea about this; I still think it is a misconfiguration. /help
@Rozzii: This request has been marked as needing help from a contributor.
Please ensure the request meets the requirements listed here.
If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.
The problem was that I didn't base64-encode the BMC credentials in the Secret. Ironic then tried to decode the plaintext values, which resulted in garbage credentials. Base64-encoding the credentials solved the problem.
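For anyone hitting the same 401, the fix boils down to base64-encoding the values before putting them under the Secret's `data` key. A minimal sketch; `root`/`calvin` are illustrative iDRAC-style placeholders, not values from this issue:

```shell
# Kubernetes Secret `data` fields must be base64-encoded.
# ('root' and 'calvin' are placeholder credentials, not from this issue.)
username_b64=$(printf '%s' 'root' | base64)
password_b64=$(printf '%s' 'calvin' | base64)
echo "username: ${username_b64}"   # username: cm9vdA==
echo "password: ${password_b64}"   # password: Y2Fsdmlu
```

Note that `kubectl create secret generic <name> --from-literal=username=... --from-literal=password=...` performs the encoding for you, and the `stringData` field of a Secret manifest accepts plaintext, so either avoids this class of problem entirely.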
Issue Details
I'm currently using Metal3 to manage my Dell servers equipped with iDRAC BMCs. I've successfully set up a Kubernetes cluster using kubeadm and deployed the Bare Metal Operator (BMO) via the provided deploy.sh script. This is what my BareMetalHost manifest looks like:
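For reference, a typical BareMetalHost with its credentials Secret looks roughly like the sketch below. Every name, address, and value here is a placeholder or assumption, not taken from the original issue:

```yaml
# Illustrative sketch only -- all names and values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: example-bmc-secret
type: Opaque
data:
  username: cm9vdA==   # base64 of "root"
  password: Y2Fsdmlu   # base64 of "calvin"
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: example-bmh
spec:
  online: true
  bootMACAddress: <MAC-address>
  bmc:
    address: redfish://<iDRAC-IP>/redfish/v1/Systems/System.Embedded.1
    credentialsName: example-bmc-secret
    disableCertificateVerification: true
```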
However, when attempting to register my machines by creating custom BareMetalHost (bmh) resources, I encounter a RegistrationError. Reviewing the logs of the baremetal-operator-controller-manager, I found this message:
Expectation
It seems the BMO receives a poorly formatted message from iDRAC. However, accessing the URL
https://<iDRAC-IP>:443/redfish/v1/Systems/System.Embedded.1
directly in my browser yields correctly formatted JSON. Therefore, I believe iDRAC is behaving correctly and is reachable, ruling out networking issues.
Additional information
Although I'm not sure if it's relevant, I'm operating a single-node Kubernetes cluster using kubeadm and removing the control-plane taint with
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
Additionally, I logged into the ironic container and issued a curl on the same resource:
https://<iDRAC-IP>:443/redfish/v1/Systems/System.Embedded.1
and successfully received the expected JSON.
Environment
/kind bug