xcat2 / xcat-core

Code repo for xCAT core packages
Eclipse Public License 1.0

[FVT] Avoid polluting environment variables of child processes while running postbootscripts and postscripts #232

Open neo954 opened 8 years ago

neo954 commented 8 years ago

xCAT passes over one hundred environment variables to its child processes while running postbootscripts and postscripts. In some cases, this pollutes the runtime environment of the child process.

I suggest that all xCAT environment variables begin with XCAT_

[root@c712ems2 ~]# updatenode c712f7n06 printenv
c712f7n06: xcatdsklspost: downloaded postscripts successfully
c712f7n06: Wed Sep 30 01:59:28 EDT 2015 Running postscript: printenv
c712f7n06: === DEBUGGING BEGIN ===
c712f7n06: ARCH=ppc64le
c712f7n06: AUDITNOSYSLOG=0
c712f7n06: BLADEMAXP=64
c712f7n06: CFGMGR=
c712f7n06: CFGSERVER=
c712f7n06: CLEANUPXCATPOST=no
c712f7n06: CONSOLEONDEMAND=no
c712f7n06: DATABASELOC=/var/lib
c712f7n06: DB2INSTALLLOC=/mntdb2
c712f7n06: DHCPINTERFACES='enP2p1s0f0,enP2p1s0f2'
c712f7n06: DHCPLEASE=43200
c712f7n06: DNSHANDLER=ddns
c712f7n06: DOMAIN=pok.stglabs.ibm.com
c712f7n06: ENABLEASMI=no
c712f7n06: ENABLESSHBETWEENNODES=YES
c712f7n06: ENVLIST1=IBM_XLC_LICENSE_ACCEPT=yes IBM_XLF_LICENSE_ACCEPT=yes IBM_ESSL_LICENSE_ACCEPT=yes IBM_PESSL_LICENSE_ACCEPT=yes
c712f7n06: ENVLIST6=IBM_PPE_RTE_LICENSE_ACCEPT=yes
c712f7n06: FORWARDERS=9.114.39.147
c712f7n06: FSPTIMEOUT=0
c712f7n06: GROUP=all,f7,fvt
c712f7n06: HOME=/root
c712f7n06: IBM_ESSL_LICENSE_ACCEPT=yes
c712f7n06: IBM_PESSL_LICENSE_ACCEPT=yes
c712f7n06: IBM_XLC_LICENSE_ACCEPT=yes
c712f7n06: IBM_XLF_LICENSE_ACCEPT=yes
c712f7n06: INSTALLDIR=/install
c712f7n06: INSTALLNIC=mac
c712f7n06: IPMIMAXP=64
c712f7n06: IPMIRETRIES=3
c712f7n06: IPMITIMEOUT=2
c712f7n06: LANG=C
c712f7n06: LC_ADDRESS=C
c712f7n06: LC_ALL=C
c712f7n06: LC_COLLATE=C
c712f7n06: LC_CTYPE=C
c712f7n06: LC_IDENTIFICATION=C
c712f7n06: LC_MEASUREMENT=C
c712f7n06: LC_MESSAGES=C
c712f7n06: LC_MONETARY=C
c712f7n06: LC_NAME=C
c712f7n06: LC_NUMERIC=C
c712f7n06: LC_PAPER=C
c712f7n06: LC_TELEPHONE=C
c712f7n06: LC_TIME=C
c712f7n06: LESSOPEN=||/usr/bin/lesspipe.sh %s
c712f7n06: LOGNAME=root
c712f7n06: MACADDRESS=98:be:94:59:fc:5e
c712f7n06: MACMAC=98:be:94:59:fc:5e
c712f7n06: MAIL=/var/mail/root
c712f7n06: MASTER=9.114.39.149
c712f7n06: MASTER_IP=9.114.39.149
c712f7n06: MAXSSH=8
c712f7n06: MONMASTER=9.114.39.149
c712f7n06: MONSERVER=c712ems2.pok.stglabs.ibm.com
c712f7n06: NAMESERVERS=9.114.39.149,9.12.16.2
c712f7n06: NETWORKS_LINE1=netname=9_114_39_0-255_255_255_0||net=9.114.39.0||mask=255.255.255.0||mgtifname=enP2p1s0f0||gateway=9.114.39.254||dhcpserver=||tftpserver=9.114.39.149||nameservers=||ntpservers=||logservers=||dynamicrange=||staticrange=||staticrangeincrement=||nodehostname=||ddnsdomain=||vlanid=||domain=||disable=||comments=
c712f7n06: NETWORKS_LINE2=netname=10_128_129_0-255_255_255_0||net=10.128.129.0||mask=255.255.255.0||mgtifname=enP2p1s0f2||gateway=<xcatmaster>||dhcpserver=||tftpserver=10.128.129.254||nameservers=||ntpservers=||logservers=||dynamicrange=10.128.129.210-10.128.129.250||staticrange=||staticrangeincrement=||nodehostname=||ddnsdomain=||vlanid=||domain=||disable=||comments=
c712f7n06: NETWORKS_LINES=2
c712f7n06: NFSSERVER=9.114.39.149
c712f7n06: NICCUSTOMSCRIPTS=
c712f7n06: NICEXTRAPARAMS=
c712f7n06: NICHOSTNAMESUFFIXES=
c712f7n06: NICIPS=
c712f7n06: NICNETWORKS=
c712f7n06: NICNODE=
c712f7n06: NICTYPES=
c712f7n06: NODE=c712f7n06
c712f7n06: NODEROUTENAMES=
c712f7n06: NODESETSTATE=boot
c712f7n06: NODESYNCFILEDIR=/var/xcat/node/syncfiles
c712f7n06: NTYPE=compute
c712f7n06: OSPKGDIR=/install/rhels7.2snapshot2/ppc64le
c712f7n06: OSPKGS=wget,ntp,nfs-utils,net-snmp,rsync,yp-tools,openssh-server,util-linux,net-tools,kernel-devel,gcc,pciutils,python-devel,redhat-rpm-config,rpm-build,lsof,tcl,gcc-gfortran,tcsh,tk,autofs
c712f7n06: OSVER=rhels7.2snapshot2
c712f7n06: OTHERPKGDIR=/install/post/otherpkgs/rhels7.2snapshot2/compute_1539a
c712f7n06: OTHERPKGS1=
c712f7n06: OTHERPKGS2=,cuda-deps/dkms,cuda-deps/epel-rpm-macros,
c712f7n06: OTHERPKGS3=,cuda-repo-7-5-local/nvidia-kmod,
c712f7n06: OTHERPKGS4=,cuda-repo-7-5-local/nvidia-uvm-kmod,
c712f7n06: OTHERPKGS5=,cuda-repo-7-5-local/cuda-command-line-tools-7-5,cuda-repo-7-5-local/cuda-core-7-5,cuda-repo-7-5-local/cuda-cublas-7-5,cuda-repo-7-5-local/cuda-cublas-dev-7-5,cuda-repo-7-5-local/cuda-cudart-7-5,cuda-repo-7-5-local/cuda-cudart-dev-7-5,cuda-repo-7-5-local/cuda-cufft-7-5,cuda-repo-7-5-local/cuda-cufft-dev-7-5,cuda-repo-7-5-local/cuda-curand-7-5,cuda-repo-7-5-local/cuda-curand-dev-7-5,cuda-repo-7-5-local/cuda-cusolver-7-5,cuda-repo-7-5-local/cuda-cusolver-dev-7-5,cuda-repo-7-5-local/cuda-cusparse-7-5,cuda-repo-7-5-local/cuda-cusparse-dev-7-5,cuda-repo-7-5-local/cuda-documentation-7-5,cuda-repo-7-5-local/cuda-driver-dev-7-5,cuda-repo-7-5-local/cuda-drivers,cuda-repo-7-5-local/cuda-gdb-src-7-5,cuda-repo-7-5-local/cuda-license-7-5,cuda-repo-7-5-local/cuda-minimal-build-7-5,cuda-repo-7-5-local/cuda-misc-headers-7-5,cuda-repo-7-5-local/cuda-npp-7-5,cuda-repo-7-5-local/cuda-npp-dev-7-5,cuda-repo-7-5-local/cuda-nvrtc-7-5,cuda-repo-7-5-local/cuda-nvrtc-dev-7-5,cuda-repo-7-5-local/cuda-runtime-7-5,cuda-repo-7-5-local/cuda-samples-7-5,cuda-repo-7-5-local/gpu-deployment-kit,cuda-repo-7-5-local/xorg-x11-drv-nvidia,cuda-repo-7-5-local/xorg-x11-drv-nvidia-devel,cuda-repo-7-5-local/xorg-x11-drv-nvidia-libs,
c712f7n06: OTHERPKGS6=,pperte-2.3.0.0-1539a/ppe_rte_license,pperte-2.3.0.0-1539a/pperte-compute,pperte-2.3.0.0-1539a/pperte,pperte-2.3.0.0-1539a/ppe_rte_2300,pperte-2.3.0.0-1539a/pperteman,pperte-2.3.0.0-1539a/ppe_rte_man_2300,pperte-2.3.0.0-1539a/ppertesamples,pperte-2.3.0.0-1539a/ppe_rte_samples_2300,pessl-5.2.0-0-rhels-7.2-ppc64le/pessl-computenode-3264rtempich,pessl-5.2.0-0-rhels-7.2-ppc64le/pessl-computenode,essl-5.4.0-0-rhels-7.2-ppc64le/essl-computenode-6464rte,essl-5.4.0-0-rhels-7.2-ppc64le/essl-computenode,essl-5.4.0-0-rhels-7.2-ppc64le/essl-computenode-3264rte,pessl-5.2.0-0-rhels-7.2-ppc64le/pessl-license,essl-5.4.0-0-rhels-7.2-ppc64le/essl-computenode-3264rtecuda,essl-5.4.0-0-rhels-7.2-ppc64le/essl-license,xlc-13.1.2-0-rhels-7.2-ppc64le/xlc.license-compute,xlf-15.1.2-0-rhels-7.2-ppc64le/xlf.license-compute,xlf-15.1.2-0-rhels-7.2-ppc64le/xlf.rte-compute,xlc-13.1.2-0-rhels-7.2-ppc64le/xlc.compiler-compute,xlf-15.1.2-0-rhels-7.2-ppc64le/xlf.compiler-compute,xlc-13.1.2-0-rhels-7.2-ppc64le/xlc.rte-compute
c712f7n06: OTHERPKGS_INDEX=6
c712f7n06: PATH=//xcatpost://xcatpost:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/ibutils/bin
c712f7n06: PERL_BADLANG=0
c712f7n06: POWERINTERVAL=0
c712f7n06: PPCMAXP=64
c712f7n06: PPCRETRY=3
c712f7n06: PPCTIMEOUT=0
c712f7n06: PRIMARYNIC=mac
c712f7n06: PROFILE=compute
c712f7n06: PROVMETHOD=compute_1539a
c712f7n06: PWD=//xcatpost
c712f7n06: SHAREDTFTP=1
c712f7n06: SHELL=/bin/bash
c712f7n06: SHLVL=4
c712f7n06: SITEMASTER=9.114.39.149
c712f7n06: SNSYNCFILEDIR=/var/xcat/syncfiles
c712f7n06: SSHBETWEENNODES=ALLGROUPS
c712f7n06: SSH_CLIENT=9.114.39.149 47372 22
c712f7n06: SSH_CONNECTION=9.114.39.149 47372 9.114.39.180 22
c712f7n06: SYSPOWERINTERVAL=0
c712f7n06: TFTPDIR=/tftpboot
c712f7n06: TIMEZONE=America/New_York
c712f7n06: UPDATENODE=1
c712f7n06: USENMAPFROMMN=no
c712f7n06: USEOPENSSLFORXCAT=1
c712f7n06: USER=root
c712f7n06: VSFTP=n
c712f7n06: XCATCONFDIR=/etc/xcat
c712f7n06: XCATDPORT=3001
c712f7n06: XCATIPORT=3002
c712f7n06: XCATSERVER=9.114.39.149:3001
c712f7n06: XCATSSLVERSION=TLSv1
c712f7n06: XDG_RUNTIME_DIR=/run/user/0
c712f7n06: XDG_SESSION_ID=16
c712f7n06: ZONENAME=
c712f7n06: _=/usr/bin/env
c712f7n06: === DEBUGGING END ===
c712f7n06: Postscript: printenv exited with code 0
c712f7n06: Running of postscripts has completed.

Here is the code for printing all the environment variables.

[root@c712ems2 ~]# cat /install/postscripts/printenv 
#!/bin/bash
echo === DEBUGGING BEGIN ===
env | sort
echo === DEBUGGING END ===

whowutwut commented 8 years ago

I don't think we should make this kind of change in the product at this time. It would break all existing postscripts that are not aware of the XCAT_ prefix, as well as customer scripts.

samveen commented 8 years ago

@whowutwut Can a whole second set of variables with the XCAT_ prefix be added as an interim step, instead of replacing the variable names? This should allow an upgrade path to fixing this problem later without breaking existing scripts. Adding a note to that effect in the relevant user doc would at least help new customer scripts avoid parts of this problem.

whowutwut commented 8 years ago

@samveen Yes, I think that might be a good interim step.

neo954 commented 8 years ago

The point of this bug/issue is not to provide another set of environment variables to postbootscripts/postscripts. The point is to avoid leaking environment variables indiscriminately into 3rd-party scripts by default. It can cost days to debug such a problem, which should be avoided in the first place.

zet809 commented 8 years ago

There is some risk in modifying this in 2.11, so it is moved to 2.12. Before renaming the environment variables, we need to send a note to the mailing list to let customers know.

whowutwut commented 8 years ago

What about just documenting a way for users to run a script in a clean environment? For example:

/install/postscripts/executeclean

which could look something like:

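#!/bin/sh

# Run the given script with an empty environment (env -i),
# passing any remaining arguments through unchanged.
echo "=== $0: Running script in clean shell..."
SCRIPT=${1}
shift
env -i "${SCRIPT}" "$@"
RC=$?
echo "=== $0: Completed running ${SCRIPT}, RC=${RC}"

zet809 commented 7 years ago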

Hi @whowutwut, 'env -i' will start a script with an empty environment, but if the user needs to read the environment variables exported by xCAT, none of them will be available.
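A possible middle ground between the two positions, sketched here only as an illustration: clear the environment with 'env -i' but pass through an explicit list of variables. The function name run_clean, the KEEP_VARS variable, and its default list are hypothetical and are not part of xCAT.

```shell
#!/bin/bash
# Hypothetical helper (not an xCAT utility): run a command in an otherwise
# empty environment, forwarding only the variables named in KEEP_VARS.
run_clean() {
    # KEEP_VARS is an assumed, space-separated list of variable names.
    local keep="${KEEP_VARS:-PATH NODE MASTER}"
    local -a passthrough=()
    local name
    for name in $keep; do
        # Forward a variable only if it is actually set in the caller.
        if [ -n "${!name+set}" ]; then
            passthrough+=("${name}=${!name}")
        fi
    done
    # env -i starts from an empty environment, then adds the NAME=VALUE pairs.
    env -i "${passthrough[@]}" "$@"
}
```

For example, `KEEP_VARS="NODE MASTER" run_clean /path/to/script` would give the script only NODE and MASTER. The script still has to be told which variables it receives, so this does not remove the need for documentation.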

So I think the correct method is to rename those variables by:

  1. sending out a notification on the mailing list that we will rename the variables
  2. mentioning the renaming in our release notes (such as the 2.13.1 release notes, if we'd like to implement it in that release)

whowutwut commented 7 years ago

Can we take a staggered approach, where we take the existing exported variables that we plan to change and echo a WARN message into the logs, so users will see it in their /var/log/xcat/xcat.log, and then export the new variable? For example:

export ARCH='ppc64le'
echo "WARN: ARCH is to be deprecated, use XCAT_ARCH" 
export XCAT_ARCH='ppc64le'
....
export DHCPINTERFACES='enP2p1s0f0,enP2p1s0f2'  
echo "WARN: DHCPINTERFACES is to be deprecated, use XCAT_DHCPINTERFACES" 
export XCAT_DHCPINTERFACES='enP2p1s0f0,enP2p1s0f2'  

Once this is in place, we can send a note out to the xcat-users email distribution and start mentioning it in the release notes.
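The export pattern above could be factored into a small helper, sketched here under assumptions: the function name export_with_prefix is hypothetical, and where xCAT would actually generate these lines is not specified in this thread.

```shell
#!/bin/bash
# Hypothetical helper (not existing xCAT code): export a variable under its
# legacy name and under an XCAT_-prefixed name, and print a deprecation
# warning for the legacy name.
export_with_prefix() {
    local name="$1" value="$2"
    export "${name}=${value}"
    echo "WARN: ${name} is to be deprecated, use XCAT_${name}"
    export "XCAT_${name}=${value}"
}

export_with_prefix ARCH 'ppc64le'
export_with_prefix DHCPINTERFACES 'enP2p1s0f0,enP2p1s0f2'
```

Variables that already start with XCAT (for example XCATSERVER) would need a naming decision of their own. This sketch prefixes them like any other name.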

Also, in the xcatprobe utility, add a check that scans the scripts under the /install/postscripts directory for the deprecated variables and prints a warning message, to help users detect scripts that may be using them.
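A rough sketch of what such a scan might look like, assuming GNU grep. The names DEPRECATED_VARS and scan_postscripts are illustrative, the variable list is only a sample, and none of this is an existing xcatprobe interface.

```shell
#!/bin/bash
# Hypothetical check (not an xcatprobe API): report scripts that reference
# legacy variable names. The list below is a sample, not the full set.
DEPRECATED_VARS="ARCH DHCPINTERFACES MASTER NODE"

scan_postscripts() {
    local dir="${1:-/install/postscripts}"
    local var file pattern
    for var in $DEPRECATED_VARS; do
        # Match $VAR or ${VAR...} as a whole name, so that $NODE does not
        # also match $NODESETSTATE or $XCAT_NODE.
        pattern='\$\{?'"${var}"'([^A-Za-z0-9_]|$)'
        grep -rlE "$pattern" "$dir" 2>/dev/null |
        while IFS= read -r file; do
            echo "WARN: ${file} uses deprecated variable ${var}, use XCAT_${var}"
        done
    done
}
```

A text search like this can produce false positives (for example a script that defines its own NODE variable) and will miss names built dynamically, so its output is a hint rather than proof.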

This will take some time to implement but I think we should at least start putting the pieces in place.

zet809 commented 7 years ago

@whowutwut I agree with you. It would be better if we can add them before the 2.13.1 release.

zet809 commented 7 years ago

I think we need to add the notifications as @whowutwut suggested in the next release.

whowutwut commented 7 years ago

It doesn't seem like this item is being tracked correctly, and we probably will not get around to it, so I am removing the milestone and sprint label until we can reassess.