Nr18 closed this issue 6 years ago
The best approach to detect new reports is via S3 events if you're working only on your own AWS account. This way, when AWS places a new Cost and Usage report in S3, the S3EventStepFunctionStarter function starts the processing of the new CUR and creates a new Athena table.
We use the xAcctStepFunctionStarter function to access Cost and Usage reports in different AWS accounts (e.g. dev & test accounts, or clients), which is why it checks every 5 minutes for new reports. lastProcessedTimestamp is used, among other things, for an API-like feature in the AthenaQueryMgr class in athena.py. The idea is that when this API is called, we don't want to query Athena every single time. Instead, the code stores query results in S3, and they get updated when a new CUR is processed. Adding examples of how to call this API is WIP, though. But we have xAcctStepFunctionStarter running for a lot of AWS accounts and it does work well.
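To illustrate the caching idea described above, here is a minimal sketch of the freshness check. It is not the actual AthenaQueryMgr code: the function name `needs_refresh` and the `cached_at` argument are hypothetical, and it assumes both timestamps are timezone-aware datetimes.

```python
from datetime import datetime, timezone
from typing import Optional


def needs_refresh(last_processed: datetime, cached_at: Optional[datetime]) -> bool:
    """Return True when Athena should be queried again.

    last_processed: when the latest CUR was processed (lastProcessedTimestamp).
    cached_at: when the stored query result was written to S3, or None if
               no cached result exists yet (hypothetical parameter).
    """
    if cached_at is None:
        # Nothing cached yet, so the query has to run at least once.
        return True
    # A cached result older than the latest processed CUR is stale.
    return cached_at < last_processed


if __name__ == "__main__":
    processed = datetime(2019, 1, 2, tzinfo=timezone.utc)
    cached = datetime(2019, 1, 1, tzinfo=timezone.utc)
    print(needs_refresh(processed, cached))
```

With a check like this, a call only reaches Athena after a new CUR has been processed; otherwise the result stored in S3 can be returned directly.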
Ah cool, I changed my PR to disable the scheduled event via a parameter and added the S3 trigger event.
But it seems that SAM is ignoring the condition.
One thing with S3 events and SAM is that you can only enable S3 events for buckets that are created in the same SAM template. That might be the reason.
It's a known limitation of SAM, see: https://github.com/awslabs/serverless-application-model/issues/142
I tried the following, removing the S3 event from the Lambda function and adding it to the S3 bucket:
    CostUsageReport:
      Condition: CreateCurS3BucketEnabled
      DependsOn: [ S3EventStepFunctionStarter ]
      Type: AWS::S3::Bucket
      Properties:
        AccessControl: Private
        BucketName: !Ref BucketName
        # LambdaConfigurations is a list nested under NotificationConfiguration
        NotificationConfiguration:
          LambdaConfigurations:
            - Event: s3:ObjectCreated:*
              Filter:
                S3Key:
                  Rules:
                    - Name: prefix
                      Value: !Sub '${ReportPathPrefix}/'
                    - Name: suffix
                      Value: Manifest.json
              Function: !GetAtt S3EventStepFunctionStarter.Arn
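For reference, the prefix and suffix rules in the bucket notification above can be mirrored inside the Lambda handler as a defensive check. The following is only a sketch, not the code in S3EventStepFunctionStarter: the helper name `manifest_keys` and its `report_path_prefix` parameter are hypothetical. It relies on the standard S3 event layout, where each record carries the bucket name and a URL-encoded object key.

```python
from urllib.parse import unquote_plus


def manifest_keys(event, report_path_prefix):
    """Return (bucket, key) pairs for CUR manifest files found in an S3 event."""
    # Normalize the prefix so "cur" and "cur/" behave the same way.
    prefix = report_path_prefix.rstrip("/") + "/"
    matches = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(prefix) and key.endswith("Manifest.json"):
            matches.append((bucket, key))
    return matches


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "my-bucket"},
                    "object": {"key": "cur/report/20190101-20190201/report-Manifest.json"}}}
        ]
    }
    print(manifest_keys(sample_event, "cur"))
```

A handler could start the Step Function once per returned pair and ignore events that yield an empty list.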
I tried a new condition, CreateCurS3BucketEnabled, because the S3 bucket creation was interfering with existing stacks that already have an S3 bucket and a history of Cost and Usage reports in them.
I think we can close this issue. S3 event related issues are covered by https://github.com/concurrencylabs/aws-cost-analysis/pull/7
In the xacct-step-function-starter.py file you scan with the following filter attributes:
You only write the lastProcessedTimestamp value in update-metadata.py and you read it in athena.py.
As far as I can see, xacct-step-function-starter.py, and thus the Lambda function xAcctStepFunctionStarter, which is scheduled to run every 5 minutes, never has any effect due to the missing dataCollectionStatus attribute.
What is the best approach to run the detection of new reports: via S3 events or scheduled events?