airshipit / airshipctl

A CLI for managing declarative infrastructure.
Apache License 2.0
43 stars 49 forks source link

Need clarification for Real Node Introspection using Airship Setup #300

Closed sumitjadhav1 closed 3 years ago

sumitjadhav1 commented 4 years ago

Describe the bug Real Node introspection fails using Airship setup on Dell PowerEdge servers. Node Remains in Inspecting state forever due to incomplete DHCP transaction (DORA not complete, only DO happens & not RA)

Steps To Reproduce

  1. i) Changes done in provisioning network configuration : 10.23.24.x -> 192.168.121.x
  2. ii) Add physical interface in Provisioing bridge (prov_br)
  3. Applied BMH CR files which has 192.168.110.x iDRAC Network, Registration Step is successful.
  4. Troubleshooting done with packet traces using tcpdump, dnsmasq container logs. PFA logs for reference.

Queries along with above issue :-

  1. Use of NetworkData within Node-folders : manifests/site/test-site/shared/baremetalhost/nodeNN ?
  2. Steps/Extra configuration for Airship setup when using Real Baremetal Nodes (Dell PowerEdge) ---> Just to ensure, we can correct in case of missing some configuration

Point to Note : (Verified) Metal3 Setup works well with Real Baremetal Nodes using metal3-dev-env repository.

Expected behavior

  1. We can successfully on-board real baremetal node using Airship setup

Environment

Airship-Introspection-Issue-DNSMASQ-Container-logs.txt

jezogwza commented 4 years ago

Attached is an example baremetalhost definition files from a downstream lab Missing is the Secret containing the bmc credentials.

rdm9r006o001.tar.gz

sumitjadhav1 commented 4 years ago

Config File Changes

As requested, please find the list of changes in above patch-set.

Let us know if any additional information is required.

Ashughorla commented 4 years ago

logs.tar.gz

Network changes : IP address 10.23.24.x to 192.168.121.x Added new documents for baremetal node.(baremetalhost.yaml and networkdata.yaml)

mattmceuen commented 4 years ago

@sb464f could you please take a look at the logs above, and the issue problem description, in case you've seen this issue before?

sb464f commented 4 years ago

So looking at logs DHCP range looks good. but the /etc/dnsmasq.conf seems to be having different values for DHCP_RANGE. to troubleshoot a bit further worth trying to modify dnsmasq.conf with below entrypoint. for file manifests/function/baremetal-operator/entrypoint/dnsmasq-entrypoint

#!/usr/bin/bash

. /bin/ironic-common.sh

export HTTP_PORT=${HTTP_PORT:-"80"}
DNSMASQ_EXCEPT_INTERFACE=${DNSMASQ_EXCEPT_INTERFACE:-"lo"}

wait_for_interface_or_ip

mkdir -p /shared/tftpboot
mkdir -p /shared/html/images
mkdir -p /shared/html/pxelinux.cfg

# Copy files to shared mount
cp /tftpboot/undionly.kpxe /tftpboot/ipxe.efi /tftpboot/snponly.efi /shared/tftpboot

# Template and write dnsmasq.conf
python3 -c 'import os; import sys; import jinja2; sys.stdout.write(jinja2.Template(sys.stdin.read()).render(env=os.environ))' </etc/dnsmasq.conf.j2 >/etc/dnsmasq.conf

for iface in $( echo "$DNSMASQ_EXCEPT_INTERFACE" | tr ',' ' '); do
    sed -i -e "/^interface=.*/ a\except-interface=${iface}" /etc/dnsmasq.conf
done

exec /usr/sbin/dnsmasq -d -q -C /etc/dnsmasq.conf
sb464f commented 4 years ago

Looks like the dnsmasq.conf is from /shared/dnsmasq.conf and looks to be using correct config from https://opendev.org/airship/airshipctl/src/branch/master/manifests/function/baremetal-operator/config-file/dnsmasq.conf

mattmceuen commented 4 years ago

Can we try running this script to get the namespaced objects and their configurations? https://github.com/airshipit/treasuremap/blob/master/tools/gate/debug-report.sh

jezogwza commented 3 years ago

Close this is working now, reported by user.