Nr18 closed this issue 6 years ago
Hi, thanks for reporting this issue and sorry to hear it caused some trouble. Just writing to acknowledge it and to let you know that we'll take a look at it later today.
Thanks!
So I played around with it a bit and got it working, but I did need to change a few things:
In processor.py I changed:
monthDestPrefix = self.destPrefix + period_prefix
to:
monthDestPrefix = '{}{}/{}'.format(self.destPrefix, self.accountId, period_prefix)
This is because the Athena table expects the accountId to be in the path, and I removed the placeholder string in the destination path to correct it. I need to re-test it, and then I will commit my changes to my fork and submit a pull request. The issue above was caused by Athena querying an empty path.
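To make the effect of that change concrete, here is a minimal sketch of the two prefix constructions side by side. The bucket prefix, account ID and period values are made-up placeholders, and the helper name month_dest_prefix is hypothetical; only the format expression comes from the change above.

```python
def month_dest_prefix(dest_prefix, account_id, period_prefix):
    """Build the S3 key prefix for one month's processed files.

    Mirrors the changed line in processor.py: the account ID becomes
    a path segment between the destination prefix and the period.
    """
    return '{}{}/{}'.format(dest_prefix, account_id, period_prefix)


# Hypothetical example values, not taken from the project.
dest_prefix = 'curs-processed/'
account_id = '123456789012'
period_prefix = '20180101-20180201/'

# Old behaviour: no account ID in the path.
old_prefix = dest_prefix + period_prefix
# New behaviour: account ID is part of the path.
new_prefix = month_dest_prefix(dest_prefix, account_id, period_prefix)

print(old_prefix)
print(new_prefix)
```

If the Athena table location includes the account ID, only the second form puts the processed files where the table looks for them.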
It sounds like accountId was somehow not set at the beginning of the Step Function execution.
The Step Function takes as an input a dictionary that includes, among other things, accountId. This dictionary gets passed from one step to the next. For some reason, when it was time to execute function init-athena-queries, accountId was missing and the execution failed.
One recommended way to start the Step Function is by using the starter function s3event-step-function-starter.py, which gets triggered by an S3 event whenever a new Cost and Usage report is placed in the source S3 bucket. This function sets accountId, which should eventually make it to the init-athena-queries function.
If you're starting the Step Function differently, just make sure that it has a dictionary that includes year, month, sourceBucket, sourcePrefix, destBucket, destPrefix and accountId, and optionally xAccountSource and roleArn if you're accessing Cost and Usage reports cross-account.
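As a rough illustration of that input check, the sketch below validates a dictionary before it is handed to the Step Function. The key names are the ones listed above; the helper missing_input_keys and the sample values are hypothetical and not part of the project.

```python
# Keys the Step Function input is expected to carry, per the list above.
REQUIRED_KEYS = (
    "year", "month", "sourceBucket", "sourcePrefix",
    "destBucket", "destPrefix", "accountId",
)
# Only needed when reading Cost and Usage reports cross-account.
OPTIONAL_KEYS = ("xAccountSource", "roleArn")


def missing_input_keys(event):
    """Return the required keys that are absent or empty in event."""
    return [key for key in REQUIRED_KEYS if not event.get(key)]


# Hypothetical input with accountId left out.
sample_input = {
    "year": "2018",
    "month": "01",
    "sourceBucket": "my-cur-source",
    "sourcePrefix": "reports/",
    "destBucket": "my-cur-dest",
    "destPrefix": "curs-processed/",
}

print(missing_input_keys(sample_input))
```

Running a check like this before starting an execution would surface a missing accountId up front rather than several steps later.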
I will test a code update that double-checks for accountId at the end of the process-cur function and sets it, in case it wasn't set by the Step Function starter.
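A fallback of that kind might look roughly like the sketch below. It is not the actual update: ensure_account_id is a hypothetical helper, and how the account ID is looked up is left to a caller-supplied function, since the thread does not say where the value would come from.

```python
def ensure_account_id(event, lookup_account_id):
    """Return a copy of event that is guaranteed to carry accountId.

    lookup_account_id is a zero-argument callable (hypothetical) that is
    only invoked when the incoming event has no usable accountId.
    """
    if event.get("accountId"):
        return dict(event)
    return dict(event, accountId=lookup_account_id())


# Hypothetical usage at the end of a handler: the starter did not set accountId.
incoming = {"year": "2018", "month": "01"}
outgoing = ensure_account_id(incoming, lambda: "123456789012")
print(outgoing["accountId"])
```

Returning a copy keeps the incoming dictionary untouched, so the same input can be reused or logged without the fallback value leaking into it.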
@concurrencylabs I am using the s3event-step-function-starter.py function via an event trigger on the bucket that receives the reporting from AWS.
I pushed my code and created PR #7 so that you can see what I did. I'm running the whole thing as described in the README.md.
The accountId is in the event, but as far as I could see it was not used in the path to upload the processed CSV files. Again, you can see that in the PR.
If I need to change something in the PR, I'm happy to do that. I only added the stuff that was missing for me to get started with this project, so that I could start playing around with queries in Athena.
Thanks for the PR, will take a look and leave it running in a test environment.
Hi,
I'm trying to set up this project using the CloudFormation template (a PR is coming once I have it working), and in the Step Function the following error is raised:
So I added a debug statement to print the queryresults in the get_query_execution_results function: log.info("Error: {}".format(json.dumps(queryresults)))
That results in:
Due to the empty Data: {} the script will fail. Can I ignore this, or is it caused by a misconfiguration? Thanks!