aws / aws-parallelcluster

AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
https://github.com/aws/aws-parallelcluster
Apache License 2.0
818 stars 309 forks

SSL trust issues for corporate AMIs #2916

Open dan-schlecht opened 3 years ago

dan-schlecht commented 3 years ago

The ParallelCluster team uses this template to report known issues on github. If you are reporting an issue, please use the 'Bug report' template instead.

Bug description

Provide the following information:

In the sub-stack's user data, I export the following environment variables just before the call to cfn-init:

export PIP_CERT=/etc/ssl/certs/ca-bundle.crt
export SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt

I've tried adding these variables to /etc/bashrc, /etc/profile, /etc/environment, /etc/profile.d/sh.local, etc/profile.d/. But cluster creation still fails at Master node creation. When I log into the Master node, the environment variables are there, and I can manually execute the commands that fail during Master node creation. It's as if the environment variables are ignored and a new set of environment variables is set up. I would love to know why this is happening, and whether there is a better solution than my workaround.

This is the env just before cfn-init is called:

cookbook_version=aws-parallelcluster-cookbook-2.11.0
OLDPWD=/tmp/cookbooks
parallelcluster_version=aws-parallelcluster-2.11.0
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/aws/bin
PWD=/tmp/cookbooks/aws-parallelcluster-cookbook-2.11.0
LANG=en_US.UTF-8
HOME=
SHLVL=2
chef_version=16.13.16
_region=us-gov-east-1
berkshelf_version=7.0.10
_=/bin/env

tilne commented 3 years ago

Am I correct that the issue you're needing to solve is that you need the bootstrapping processes running when cluster nodes are launched to have those environment variables set to the desired values?

I've tried adding these variables to /etc/bashrc, /etc/profile, /etc/environment, /etc/profile.d/sh.local, etc/profile.d/

Are you writing to these files from within the user data? If so I would think you need to source it after writing it in order to have those environment variables defined in the current running process, but you would still need to export them in order to have cfn-init use the same values.
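The distinction between sourcing and exporting can be illustrated with a small sketch. It uses a temporary file as a stand-in for files such as /etc/profile, and a plain `sh -c` child process as a stand-in for cfn-init; both are assumptions made for illustration only, not how ParallelCluster itself is wired.

```shell
# Stand-in for a file such as /etc/profile (hypothetical temp file).
env_file=$(mktemp)
echo 'PIP_CERT=/etc/ssl/certs/ca-bundle.crt' > "$env_file"

# Start from a clean state so the demo does not depend on the caller's environment.
unset PIP_CERT

# Sourcing defines the variable in the current shell only.
. "$env_file"
in_current_shell=${PIP_CERT:-unset}

# A child process (stand-in for cfn-init) does not inherit a non-exported variable.
child_before_export=$(sh -c 'echo "${PIP_CERT:-unset}"')

# After export, child processes inherit it.
export PIP_CERT
child_after_export=$(sh -c 'echo "${PIP_CERT:-unset}"')

echo "current shell:       $in_current_shell"
echo "child before export: $child_before_export"
echo "child after export:  $child_after_export"

rm -f "$env_file"
```

Writing the file alone changes nothing for the running script; the variable only reaches a child process once it has been both defined in the current shell and exported.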

dan-schlecht commented 2 years ago

My current workaround is to export these variables in the userdata section of the Master and Fleet substack templates, just before cfn-init is called. I tried other workarounds, such as creating an AMI with these env variables set and pointing my cluster config file to that AMI, but the env variables are not set when cfn-init is called. I figure the PCluster python script must be setting up its own env variables and ignoring other env variables.

dan-schlecht commented 2 years ago

Sorry, to be more clear: I'm not writing to these files in userdata. I'm writing to these files and then creating an AMI image. I then update my pcluster config file to point to my custom_ami. My current workaround is to have the userdata export the env variables before cfn-init is called.
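One possible explanation, offered as an assumption rather than a confirmed diagnosis for this issue: bash reads profile files such as /etc/profile only when it starts as a login shell, while boot-time scripts typically run in a non-login, non-interactive shell that skips them. The sketch below reproduces that difference using a temporary HOME directory with a `.bash_profile` as a stand-in for the files baked into the AMI; the directory and variable names are illustrative.

```shell
# Temporary HOME with a profile file (stand-in for files baked into the AMI).
tmp_home=$(mktemp -d)
echo 'export SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt' > "$tmp_home/.bash_profile"

# Non-login, non-interactive shell: profile files are not read.
non_login=$(env -i HOME="$tmp_home" PATH="$PATH" \
  bash -c 'echo "${SSL_CERT_FILE:-unset}"')

# Login shell: reads /etc/profile and then ~/.bash_profile.
# tail keeps only our echo in case the system profile prints anything.
login=$(env -i HOME="$tmp_home" PATH="$PATH" \
  bash -lc 'echo "${SSL_CERT_FILE:-unset}"' 2>/dev/null | tail -n 1)

echo "non-login shell: $non_login"
echo "login shell:     $login"

rm -rf "$tmp_home"
```

If this is what is happening, it would be consistent with the variables being visible after logging into the Master node (a login session) but absent while the user data runs.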

yuleiwan commented 2 years ago

Hi @dan-schlecht, could you create a cluster with --norollback and check cloud-init-output.log under the /var/log dir to see the userdata output? It may give a clue as to why the env variables aren't set properly.
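As a rough sketch of that check, the helper below greps a log file for certificate-related lines. The function name `find_cert_errors` and the search pattern are assumptions chosen for illustration; the actual error text in the log may differ.

```shell
# Hypothetical helper: print certificate-related lines (with line numbers) from a log file.
find_cert_errors() {
  grep -inE 'ssl|certificate' "$1"
}

# Example usage on a node kept alive with --norollback:
#   find_cert_errors /var/log/cloud-init-output.log
```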

dan-schlecht commented 2 years ago

I updated my userdata to not export the env variables, in order to reproduce the failure. I've looked over the cloud-init and cloud-init-output files before, but I can't understand why /etc/bashrc and the other files don't seem to have an effect on environment variables during cluster creation. I assumed the python code was building its own env variables and ignoring bash env variables. I can't seem to attach log files to this post. Can I email you the log files?

yuleiwan commented 2 years ago

@dan-schlecht, Did you figure out the issue? If not, please attach custom userdata and log file to the issue, thanks.

dan-schlecht commented 2 years ago

I have a workaround in place, but my workaround is somewhat involved. It would be nice if pcluster code could inherit bash env variables or somehow be told specific env variables to use during the create process.

tilne commented 2 years ago

I can't seem to attach log files to this post.... Can I email you the log files?

Does it work to paste the relevant bits into a message?