Closed amolari closed 4 years ago
Yes, the log message should appear whenever the signal is sent to AWS CloudFormation. Here is an example of the log messages from /var/log/cloud/aws/install.log:
[admin@ip-10-0-10-51:Active:In Sync] ~ # tail -f /var/log/cloud/aws/install.log
2020-07-23T19:33:22.992Z info: [pid: 17712] [scripts/runScript.js] 2020-07-23T19:33:22.992Z info: [pid: 22942] [scripts/verifyDeploymentCompletion.js] Device is in cluster.
2020-07-23T19:33:22.992Z info: [pid: 17712] [scripts/runScript.js] 2020-07-23T19:33:22.992Z info: [pid: 22942] [scripts/verifyDeploymentCompletion.js] Sending DONE signal to CloudFormation.
2020-07-23T19:33:23.304Z info: [pid: 17712] [scripts/runScript.js] 2020-07-23T19:33:23.304Z info: [pid: 22942] [scripts/verifyDeploymentCompletion.js] Signaled Stack for instance: i-01cd2bebe6c8783aa
2020-07-23T19:33:23.305Z info: [pid: 17712] [scripts/runScript.js] 2020-07-23T19:33:23.305Z info: [pid: 22942] [scripts/verifyDeploymentCompletion.js] Signal response: undefined
2020-07-23T19:33:23.306Z info: [pid: 17712] [scripts/runScript.js] 2020-07-23T19:33:23.305Z info: [pid: 22942] [scripts/verifyDeploymentCompletion.js] Finally case got executed.
2020-07-23T19:33:23.325Z info: [pid: 17712] [scripts/runScript.js] /config/cloud/aws/node_modules/@f5devcentral/f5-cloud-libs-aws/scripts/verifyDeploymentCompletion.js exited with code 0
Additional information:
I would like to ask for additional details:
@andreykashcheev I've opened case 1-6538561390 and uploaded a qkview of the primary. Yes, the deployment times out and tries to roll back (stack delete).
Thanks for providing the qkview! I am looking at this issue and will provide an update today.
Looking at our daily test runs and results, I can tell that the AWS WAF Autoscale via BIGIQ template was deployed 7 times in the last 5 days and all deployments were successful.
Question:
@andreykashcheev I think I've found the issue (on my side). I still have some pre-5.7.0 config parts (need NLB not ELB) and the PolicyDocument still has this code:
"Fn::If": [
"useDefaultCert",
I haven't ported the new actions "cloudformation:ListStackResources" & "cloudformation:SignalResource" to both the if and else parts. I'm unable to test right now but I will asap.
But anyway, the code should report an error (lack of permissions), shouldn't it? I see in the code
logger.warn('Unable to signal resource', err);
which I do not see in my logs.
Here is the list of changes made to the template to enable signaling:
"cloudformation:ListStackResources",
"cloudformation:SignalResource"
"BigipAutoscaleGroup": {
"CreationPolicy": {
"ResourceSignal": {
"Count": {
"Ref": "scalingMinSize"
},
"Timeout": "PT30M"
}
}
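The two actions have to be present in every branch of a conditional policy, not just one. As a rough way to check that before deploying, the sketch below walks a PolicyDocument and reports which of the two actions each branch lacks. It assumes the "Fn::If" wraps the whole policy document (as the fragment quoted earlier suggests), matches action names exactly (no wildcard handling), and uses an invented example policy; the function names are illustrative and not part of any F5 or AWS tooling.

```javascript
'use strict';

// The two actions named in this thread as required for signaling.
const REQUIRED_ACTIONS = [
    'cloudformation:ListStackResources',
    'cloudformation:SignalResource'
];

// Expand a template node into its alternatives: an Fn::If yields its
// "true" and "false" branches (recursively), anything else yields itself.
function expandBranches(node) {
    if (node && typeof node === 'object' && !Array.isArray(node) &&
        Array.isArray(node['Fn::If'])) {
        const [, whenTrue, whenFalse] = node['Fn::If'];
        return [...expandBranches(whenTrue), ...expandBranches(whenFalse)];
    }
    return [node];
}

// Collect every action granted by "Allow" statements in one policy branch.
// Statement and Action may each be a single value or an array.
function allowedActions(policy) {
    const actions = new Set();
    for (const statement of [].concat((policy && policy.Statement) || [])) {
        if (statement.Effect !== 'Allow') {
            continue;
        }
        for (const action of [].concat(statement.Action || [])) {
            actions.add(action);
        }
    }
    return actions;
}

// Report, per branch, which required actions are absent.
function findMissingActions(policyDocument) {
    return expandBranches(policyDocument).map((branch, index) => {
        const present = allowedActions(branch);
        return {
            branch: index,
            missing: REQUIRED_ACTIONS.filter((action) => !present.has(action))
        };
    });
}

// Invented example: only the first branch was updated.
const policyDocument = {
    'Fn::If': [
        'useDefaultCert',
        { Statement: [{ Effect: 'Allow', Resource: '*', Action: [
            'ec2:DescribeInstances',
            'cloudformation:ListStackResources',
            'cloudformation:SignalResource'
        ] }] },
        { Statement: [{ Effect: 'Allow', Resource: '*', Action: ['ec2:DescribeInstances'] }] }
    ]
};

console.log(JSON.stringify(findMissingActions(policyDocument), null, 2));
```

Any branch whose `missing` array is non-empty still needs the new actions ported into it.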
The script did not work because of the missing Actions on BigipAutoscaleGroup; I was able to replicate the issue after removing the actions:
2020-07-27T19:29:12.662Z info: [pid: 18685] [scripts/runScript.js] 2020-07-27T19:29:12.662Z silly: [pid: 12323] [scripts/verifyDeploymentCompletion.js] solution: autoscale
2020-07-27T19:29:12.663Z info: [pid: 18685] [scripts/runScript.js] 2020-07-27T19:29:12.662Z silly: [pid: 12323] [scripts/verifyDeploymentCompletion.js] instance-count: 1
2020-07-27T19:29:12.663Z info: [pid: 18685] [scripts/runScript.js] 2020-07-27T19:29:12.663Z info: [pid: 12323] [scripts/verifyDeploymentCompletion.js] This solution does not require clustering or less than 2 instances were provisioned with deployment.
2020-07-27T19:29:12.666Z info: [pid: 18685] [scripts/runScript.js] 2020-07-27T19:29:12.666Z info: [pid: 12323] [scripts/verifyDeploymentCompletion.js] Sending DONE signal to CloudFormation.
2020-07-27T19:29:12.935Z info: [pid: 18685] [scripts/runScript.js] /config/cloud/aws/node_modules/@f5devcentral/f5-cloud-libs-aws/scripts/verifyDeploymentCompletion.js exited with code 0
After looking at the source code, I suspect that we do not see an error in the logs because the getStackResources method returns an empty list:
https://github.com/F5Networks/f5-cloud-libs-aws/blob/586c37eccb873ba369afdf4d1cd67f40679ac6b8/lib/awsCloudProvider.js#L2033
due to the missing cloudformation:ListStackResources action.
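If that suspicion is right, the silent behavior is what you get from any loop over an empty list: neither the success branch nor the error branch ever runs. The sketch below is a simplified stand-in, not the actual f5-cloud-libs-aws code; `signalAll`, `signalAllOrWarn`, the `signalOne` callback, and the extra warning text are all hypothetical. It shows the silent case and one possible guard that would make it visible.

```javascript
'use strict';

// Signal every resource in the list. With an empty list the loop body never
// runs, so neither the success nor the failure message is logged.
async function signalAll(resources, signalOne, logger) {
    for (const resource of resources) {
        try {
            await signalOne(resource);
            logger.info(`Signaled Stack for instance: ${resource}`);
        } catch (err) {
            logger.warn('Unable to signal resource', err);
        }
    }
}

// Same loop, but an empty list is reported instead of passing silently.
// The warning text here is made up for illustration.
async function signalAllOrWarn(resources, signalOne, logger) {
    if (resources.length === 0) {
        logger.warn('No stack resources found to signal; ' +
            'check the cloudformation:ListStackResources permission');
        return;
    }
    await signalAll(resources, signalOne, logger);
}
```

With a guard like this, a missing permission would leave at least one warning in install.log instead of an apparently clean exit with code 0.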
Today, I did several (~10) deployments using v5.7.0 and they all worked fine; in addition, there were 45 AWS Autoscale deployments in our daily tests and they also worked fine.
@andreykashcheev Thank you for the detailed explanation. I confirm that, after adding the missing Actions, it works as expected
Do you already have an issue opened with F5 support?
Yes
Description
The new function that signals the CloudFormation stack is not working (the stack never receives the signal).
I see in /var/log/cloud/aws/install.log
Looking in /config/cloud/aws/node_modules/@f5devcentral/f5-cloud-libs-aws/lib/awsCloudProvider.js for:
As far as I understand, the message "Signaled Stack for instance: ${instanceId}" should be logged in case of success, and "Unable to signal resource" should be logged in case of error. I cannot see either message in any log file (grep'd all files in /var/log/) on the primary instance. Am I missing something?
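For anyone repeating this check, here is a small sketch that classifies a log's signaling outcome from the message strings quoted in this thread. The function name and the outcome labels are invented for illustration, and the precedence between outcomes is an arbitrary choice.

```javascript
'use strict';

// Classify the signaling outcome recorded in a log, based on the message
// strings quoted in this issue. The returned labels are illustrative only.
function classifySignalOutcome(logText) {
    const lines = logText.split('\n');
    const has = (needle) => lines.some((line) => line.includes(needle));

    if (has('Signaled Stack for instance:')) {
        return 'signaled';
    }
    if (has('Unable to signal resource')) {
        return 'error';
    }
    if (has('Sending DONE signal to CloudFormation')) {
        // The script announced the signal but logged neither result.
        return 'silent';
    }
    return 'not-attempted';
}

// Possible usage on the instance (path taken from this issue):
// const fs = require('fs');
// const log = fs.readFileSync('/var/log/cloud/aws/install.log', 'utf8');
// console.log(classifySignalOutcome(log));
```

A result of `silent` corresponds to the situation described here: the DONE signal is announced, but neither the success nor the error message follows.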
Template
f5-aws-cloudformation/supported/autoscale/ltm/via-lb/1nic/existing-stack/bigiq/ v5.7.0
Severity Level
Severity: 5