Following on from https://github.com/guardian/ophan-housekeeper/pull/7#issuecomment-587402409, it turns out Housekeeper's cloudformation was incorrectly formatted with multiple Statement entries - we didn't get any visible warning of this when we applied it, instead Cloudformation just created a policy that only had the SNS permission, not the DynamoDB permission or the (superfluous) lambda:InvokeFunction - looks like it just found the single lastStatement (there's only supposed to be one Statement in a PolicyDocument) and used that:
This update fixes the formatting, removes the unnecessary lambda:InvokeFunction (which grants the right to invoke any other lambda in the account), and also narrows the permitted DynamoDB actions to just what the Lambda needs - the ability to Query & Delete.
I guess this means PR #5 is finally working, though looking at the code I'm a trifle worried that it's still possible for the lambda runtime to end before it has a chance to complete send the SNS message? (a race condition)
Following on from https://github.com/guardian/ophan-housekeeper/pull/7#issuecomment-587402409, it turns out Housekeeper's cloudformation was incorrectly formatted with multiple
Statement
entries - we didn't get any visible warning of this when we applied it, instead Cloudformation just created a policy that only had the SNS permission, not the DynamoDB permission or the (superfluous)lambda:InvokeFunction
- looks like it just found the single lastStatement
(there's only supposed to be oneStatement
in aPolicyDocument
) and used that:This update fixes the formatting, removes the unnecessary
lambda:InvokeFunction
(which grants the right to invoke any other lambda in the account), and also narrows the permitted DynamoDB actions to just what the Lambda needs - the ability to Query & Delete.With a test deploy of this Cloudformation update in place, [Ophan Housekeeper was finally able delete the Ophan Alerts](https://logs.gutools.co.uk/s/ophan/app/kibana#/doc/581f7b00-c4df-11e9-bc08-9d1af4d9c1d5/logstash-ophan-2020.02.18?id=kUgmWHABBMxQt1Il90L6&_g=()) which had been sent to a permanently bouncing address:
SNS notification was sent...
Housekeeper also logged that it was going to send a message to the SNS topic at 11:55am - but the notification email wasn't sent by AWS until 12:28pm (33 minutes later)!
I guess this means PR #5 is finally working, though looking at the code I'm a trifle worried that it's still possible for the lambda runtime to end before it has a chance to complete send the SNS message? (a race condition)