openshift-metal3 / dev-scripts

Scripts to automate development/test setup for openshift integration with https://github.com/metal3-io/
Apache License 2.0

Suspect yq doesn't always successfully install on 01_install_requirements.sh L100 #1643

Open tshefi opened 4 months ago

tshefi commented 4 months ago

Describe the bug
While running 01_install_requirements.sh, the script stopped, complaining that yq wasn't found. See the tail of the initial run: 01_install_requirements-2024-03-13-104741.log

Version / git show-ref
8d1e4db09ef867a71bdf9a1c07e3185238516513 refs/heads/master
8d1e4db09ef867a71bdf9a1c07e3185238516513 refs/remotes/origin/HEAD

To Reproduce
Executed on a RHEL 8.9 system, expecting a VM-based deployment. Other than CI_TOKEN, the settings below are all the changes I made to my config_root.sh:

export NUM_MASTERS=3
export NUM_WORKERS=0
export MASTER_MEMORY=65536
export MASTER_DISK=120
export MASTER_VCPU=16
export NUM_EXTRA_WORKERS=2
export EXTRA_WORKER_VCPU=8
export EXTRA_WORKER_MEMORY=32768
export EXTRA_WORKER_DISK=120
export OPENSHIFT_RELEASE_STREAM=4.14
export IP_STACK=v4
export PROVISIONING_NETWORK_PROFILE=Disabled
export REDFISH_EMULATOR_IGNORE_BOOT_DEVICE=True

Expected/observed behavior
Expected: looking at L100, since yq didn't exist on my system, it should have been installed.

Observed: the script failed to find yq on L101. Maybe it really did get pip-installed and the script just needed a wait or a refresh before trying to call it.

Anyway, I manually installed yq via snapd before I looked into the code/logs. Subsequent script re-execution continued, as seen in 01_install_requirements-2024-03-13-111338.log; however, I later hit other issues, which I'm now looking into.

Not sure whether the snapd install method accounts for both of these (identical?) entries or only for the second one:
$ pip3 list | grep yq
yq 3.2.3
$ yq --version
yq 3.2.3

elfosardo commented 4 months ago

I believe the real issue is that you don't have the location of yq in your PATH. Looking at the logs I can see this:
WARNING: The scripts tomlq, xq and yq are installed in '/usr/local/bin' which is not on PATH.

tshefi commented 4 months ago

Good point. I retested on a fresh RHEL 8.9 system, where I only ran dnf install git make -y. I searched for yq both as an rpm and in pip3 list and found nothing, confirming the base OS doesn't have yq installed. I then ran make to start the script, which failed again with the same error. As you suggested, it managed to install yq but "just" fails to update/reload PATH before trying to execute yq.

I rechecked pip3 list and indeed yq is now installed. I located it:
$ locate yq
/usr/local/bin/yq
/usr/local/lib/python3.9/site-packages/yq

I printed my current PATH, only to find, as you said, no reference to the missing '/usr/local/bin':
$ echo $PATH
/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
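The manual check above can also be scripted. The following is only a sketch: dir_on_path is a hypothetical helper name, not something that exists in dev-scripts, and the PATH value is the one reported in this comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dev-scripts): succeeds if directory $1
# is an entry of the colon-separated list $2 (defaults to the current $PATH).
dir_on_path() {
    local dir="$1" path_list="${2:-$PATH}"
    # Wrapping both sides in colons matches whole entries only, so
    # /usr/local does not match /usr/local/sbin.
    case ":${path_list}:" in
        *":${dir}:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# PATH value reported in this issue.
reported_path="/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin"

if dir_on_path /usr/local/bin "$reported_path"; then
    echo "/usr/local/bin is on PATH"
else
    echo "/usr/local/bin is NOT on PATH"
fi
```

Matching whole colon-delimited entries avoids a false positive from /usr/local/sbin, which a plain substring search for /usr/local would hit.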

Can we fix 01_install_requirements.sh so that it also handles the PATH update/reload? Adding on L101 something like:

yq_path=$(find / -name yq -type f -exec dirname {} \; 2>/dev/null | head -n 1)
if [ -n "$yq_path" ]; then
    export PATH="$yq_path:$PATH"
    echo "Added $yq_path to PATH"
else
    echo "yq not found"
fi

It may be sufficient to update PATH only for the current script session. Better yet, fix it globally and re-source PATH, so that it is updated for this and any future session/scripts.
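For the session-only variant, a lighter alternative to scanning the whole filesystem with find would be to prepend the directory named in the pip warning. This is a sketch under assumptions: prepend_to_path is a hypothetical name, and /usr/local/bin is taken from the log warning rather than discovered by the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dev-scripts): prepend a directory to PATH
# for the current shell session, but only if it is not already an entry.
prepend_to_path() {
    local dir="$1"
    case ":${PATH}:" in
        *":${dir}:"*) ;;                 # already present, nothing to do
        *) PATH="${dir}:${PATH}" ;;
    esac
    export PATH
    hash -r                              # drop bash's cached command locations
}

# Directory assumed from the pip warning quoted in this thread.
prepend_to_path /usr/local/bin

# Report, without failing, whether yq can now be resolved.
command -v yq >/dev/null 2>&1 || echo "yq still not found on PATH" >&2
```

The change only affects the shell that runs it and its children, so a later login would still start with the old PATH.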

elfosardo commented 4 months ago

I'm not sure this needs fixing, to be honest. If a path is missing, you can just add it to your login shell configuration file and you'll get it automatically once you log in again.

Besides that, I find it really odd that the default PATH of your user does not include /usr/local/bin, since it does on my system and on all the systems we use for testing, including the ones based on RHEL8. Seeing that the PATH does include /usr/local/sbin also makes me think that something is missing in your system. As far as I can see, on a new system the PATH does include /usr/local/bin:
$ echo $PATH
[...] :/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin

Are you running dev-scripts as root maybe? Or did you modify your .bashrc or any other shell configuration file in any way?
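Adding the path to a shell configuration file could look roughly like the sketch below. persist_path_entry is a hypothetical helper, and the demo writes to a scratch file instead of a real ~/.bashrc; which file a given login shell actually reads is an assumption to verify on the system in question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a PATH export for dir to rc_file,
# unless that exact line is already there.
persist_path_entry() {
    local dir="$1" rc_file="$2"
    # \$PATH stays literal so it is expanded at login time, not now.
    local line="export PATH=\"${dir}:\$PATH\""
    touch "$rc_file"
    if ! grep -qxF "$line" "$rc_file"; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Demonstrate on a scratch file rather than a real shell configuration file.
rc_file="$(mktemp)"
persist_path_entry /usr/local/bin "$rc_file"
persist_path_entry /usr/local/bin "$rc_file"   # second call is a no-op
cat "$rc_file"
```

To apply it for real, the second argument would be the configuration file your login shell reads, for example "$HOME/.bashrc" for bash; the new entry takes effect at the next login or after sourcing that file.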