Open dan-schlecht opened 3 years ago
Am I correct that you need the bootstrapping processes that run when cluster nodes are launched to have those environment variables set to the desired values?
I've tried adding these variables to /etc/bashrc, /etc/profile, /etc/environment, /etc/profile.d/sh.local, etc/profile.d/
Are you writing to these files from within the user data? If so I would think you need to source it after writing it in order to have those environment variables defined in the current running process, but you would still need to export them in order to have cfn-init use the same values.
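To illustrate the source-versus-export point, here is a minimal sketch. The temp file is a hypothetical stand-in for something like a script under /etc/profile.d/, and `sh -c` stands in for any child process such as cfn-init; neither is taken from the actual ParallelCluster templates.

```shell
# Sketch: sourcing a file vs. exporting what it defines.
# vars_file is a hypothetical stand-in for a profile script.
vars_file="$(mktemp)"
echo 'PIP_CERT=/etc/ssl/certs/ca-bundle.crt' > "$vars_file"

# Start from a clean slate in case the variable is already set.
unset PIP_CERT

# Sourcing defines the variable in the current shell only.
. "$vars_file"
in_shell="$PIP_CERT"
child_before="$(sh -c 'echo "${PIP_CERT:-unset}"')"

# Exporting places it in the environment that child processes inherit.
export PIP_CERT
child_after="$(sh -c 'echo "${PIP_CERT:-unset}"')"

echo "current shell:       $in_shell"
echo "child before export: $child_before"
echo "child after export:  $child_after"
rm -f "$vars_file"
```

The child process only sees the value once the variable has been exported, which is why sourcing alone would not be enough for anything launched from the user data script.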
My current workaround is to export these variables in the userdata section of the Master and Fleet substack templates, just before cfn-init is called. I tried other workarounds, such as creating an AMI with these env variables set and pointing my cluster config file to that AMI, but the env variables are not set when cfn-init is called. I figure the PCluster Python script must be setting up its own env variables and ignoring the others.
Sorry, to be clearer: I'm not writing to these files in userdata. I'm writing to these files and then creating an AMI. I then update my pcluster config file to point to my custom_ami. My current workaround is to have userdata export the env variables before cfn-init is called.
Hi @dan-schlecht,
Could you create a cluster with --norollback, and check cloud-init-output.log under the /var/log dir to get the userdata output? There may be a clue there about why the env variables aren't set properly.
I updated my userdata to not export the env variables in order to cause the failure. I've looked over the cloud-init and cloud-init-output files before, but I can't understand why /etc/bashrc and the other files don't seem to have an effect on environment variables during cluster creation. I assumed the Python code was building its own env variables and ignoring the bash env variables. I can't seem to attach log files to this post.... Can I email you the log files?
@dan-schlecht, Did you figure out the issue? If not, please attach custom userdata and log file to the issue, thanks.
I have a workaround in place, but my workaround is somewhat involved. It would be nice if pcluster code could inherit bash env variables or somehow be told specific env variables to use during the create process.
I can't seem to attach log files to this post.... Can I email you the log files?
Does it work to paste the relevant bits into a message?
Bug description
Due to security concerns, all of our corporate AMIs have our root CA injected into the trust chain for pip installs and curl/openssl downloads. Thus, the create command fails during cluster creation. As a workaround for create, I download the YAML template files and update the user data sections for the master and fleet servers. I also have to download and update the main JSON stack file to point to my new Master server sub-stack file, and I have to use the undocumented config parameter hit-template to point to my local fleet substack file.
In the sub-stack's user data, I basically export environment variables just before the "cfn-init" CloudFormation call:
export PIP_CERT=/etc/ssl/certs/ca-bundle.crt
export SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt
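A variation on this workaround, sketched under assumptions: keep the values in a KEY=VALUE file and load them with `set -a` so that everything in the file is exported in one step. The temp file is a hypothetical stand-in for a file baked into the AMI, and `sh -c` is only a placeholder for the real bootstrap call, which is not reproduced here.

```shell
# Sketch: load KEY=VALUE lines and export them all before the bootstrap step.
env_file="$(mktemp)"   # hypothetical stand-in for a file baked into the AMI
cat > "$env_file" <<'EOF'
PIP_CERT=/etc/ssl/certs/ca-bundle.crt
SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt
EOF

set -a            # allexport: every variable assigned from here on is exported
. "$env_file"
set +a            # back to normal assignment behaviour

# Placeholder for the real bootstrap call (cfn-init in this thread); it only
# prints what a child process would inherit.
bootstrap_env="$(sh -c 'echo "$PIP_CERT $SSL_CERT_FILE"')"
echo "$bootstrap_env"
rm -f "$env_file"
```

This still has to run inside the user data script itself, so it does not remove the need to edit the templates; it only keeps the variable list in one place.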
I've tried adding these variables to /etc/bashrc, /etc/profile, /etc/environment, /etc/profile.d/sh.local, etc/profile.d/
But cluster creation fails at Master node creation.
When I log into the Master node, the environment variables are there and I can manually execute the commands that fail during Master node creation. It's as if the environment variables are ignored and new environment variables are set up...
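One possible explanation, offered as a guess rather than a confirmed diagnosis: files such as /etc/profile and /etc/profile.d/*.sh are read by login shells, while boot-time scripts typically run in a non-login, non-interactive shell that skips them. The sketch below assumes bash and uses a throwaway HOME with a .profile as a stand-in for the system profile files.

```shell
# Sketch: a login shell reads profile files, a plain non-login shell does not.
# A throwaway HOME with a .profile stands in for /etc/profile.d/*.sh.
fake_home="$(mktemp -d)"
echo 'export SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt' > "$fake_home/.profile"

probe='echo "${SSL_CERT_FILE:-unset}"'

# Non-login, non-interactive shell: roughly how boot-time scripts are run.
non_login="$(env -i HOME="$fake_home" PATH="$PATH" bash -c "$probe" 2>/dev/null | tail -n 1)"

# Login shell: roughly what an interactive SSH session gives you.
login="$(env -i HOME="$fake_home" PATH="$PATH" bash -lc "$probe" 2>/dev/null | tail -n 1)"

echo "non-login shell: $non_login"
echo "login shell:     $login"
rm -rf "$fake_home"
```

If this is what is happening, it would account for the variables being present after logging in but absent while the node is being created.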
I would love to know why this is happening and a better solution than my workaround.
This is the env just before cfn-init is called:
cookbook_version=aws-parallelcluster-cookbook-2.11.0
OLDPWD=/tmp/cookbooks
parallelcluster_version=aws-parallelcluster-2.11.0
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/aws/bin
PWD=/tmp/cookbooks/aws-parallelcluster-cookbook-2.11.0
LANG=en_US.UTF-8
HOME=
SHLVL=2
chef_version=16.13.16
_region=us-gov-east-1
berkshelfversion=7.0.10
_=/bin/env