Closed ChrisPetr0 closed 2 years ago
Thanks for reporting the issue and providing detailed information. You are correct that the job should run at the top of every hour. We have added your request to the backlog and it will be looked into in future solution releases.
Hi!
We just released v3.2.0 of the solution, and this issue has been fixed.
I see that to fix the issue, ScheduleExpression: rate(1 hour) is changed to ScheduleExpression: cron(* ? * * * *). This could result in the issue becoming intermittent. Here's how:
So this will cause a race condition. Sometimes the S3 folder will get created a few milliseconds before the cron tries to access it and everything will be fine. Other times the cron will fire first, trying to create a partition based on a non-existent S3 folder, and will fail.
To fix this, the ScheduleExpression should be cron(1 * * * ? *) so it gets called a minute after folder creation. This does delay the data availability for another minute, but makes sure the partition is successfully created every time.
Thanks for the comment. The partition keys are created with the table, and the Athena query for adding a partition uses the partition keys in the table. Whether the S3 folder exists or not won't matter. Have you experienced any add-partition query errors because the folder doesn't exist? If so, can you please provide details?
Also, the new 'hour' S3 folder is created whenever logs are processed and inserted into S3 by Kinesis Firehose, not at the top of every hour.
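To illustrate the point in the reply above, here is a minimal sketch of how an add-partition statement can be built from the clock alone, without looking at S3. The function name, the table name, the year/month/day/hour partition keys, and the bucket layout are assumptions for illustration only; they are not taken from add_athena_partitions.py.

```python
from datetime import datetime, timezone


def build_add_partition_query(table: str, log_bucket: str, when: datetime) -> str:
    """Build an Athena ADD PARTITION statement for the hour containing `when`.

    Assumptions (hypothetical, not confirmed by the solution's source):
    the table is partitioned by year/month/day/hour, and logs are stored
    under s3://<bucket>/<yyyy>/<mm>/<dd>/<hh>/.
    """
    when = when.astimezone(timezone.utc)
    location = f"s3://{log_bucket}/{when.strftime('%Y/%m/%d/%H')}/"
    # The statement is derived only from the timestamp; nothing here
    # checks whether the S3 prefix already exists.
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (year = {when.year}, month = {when.month}, "
        f"day = {when.day}, hour = {when.hour}) "
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    example_time = datetime(2020, 3, 5, 14, 59, tzinfo=timezone.utc)
    print(build_add_partition_query("waf_access_logs", "my-waf-logs", example_time))
```

Under these assumptions, the query for a given hour is the same whether Firehose has delivered any objects for that hour or not.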
Is this a valid cron expression? cron(* ? * * * *)
EventBridge's Define schedule shows error: Invalid CRON expression
The latest version (v4.0.2) of Security Automation still has this setting.
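For reference, EventBridge cron expressions have six fields (minutes, hours, day-of-month, month, day-of-week, year), and the '?' wildcard is only accepted in the two day fields, one of which must be '?'. The sketch below checks just those structural rules; check_eventbridge_cron is a hypothetical helper, not an AWS API, and it does not validate value ranges.

```python
import re
from typing import List

FIELDS = ("minutes", "hours", "day-of-month", "month", "day-of-week", "year")


def check_eventbridge_cron(expression: str) -> List[str]:
    """Return the structural problems found; an empty list means none were found."""
    match = re.fullmatch(r"cron\((.*)\)", expression.strip())
    if not match:
        return ["expression is not wrapped in cron(...)"]
    values = match.group(1).split()
    if len(values) != len(FIELDS):
        return [f"expected {len(FIELDS)} fields, got {len(values)}"]

    problems = []
    # '?' is only meaningful in the day-of-month and day-of-week fields.
    for name, value in zip(FIELDS, values):
        if "?" in value and name not in ("day-of-month", "day-of-week"):
            problems.append(f"'?' is not allowed in the {name} field")
    # Day-of-month and day-of-week cannot both be specified.
    if values[2] != "?" and values[4] != "?":
        problems.append("one of day-of-month and day-of-week must be '?'")
    return problems


if __name__ == "__main__":
    for expr in ("cron(* ? * * * *)", "cron(1 * * * ? *)"):
        print(expr, check_eventbridge_cron(expr))
```

With these rules, the expression questioned above is flagged because its '?' sits in the hours position and neither day field is '?', while cron(1 * * * ? *) passes.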
Describe the bug When using Athena-Log-Parser options for HTTP Flood, add_athena_partitions.py is set to run once per hour via CloudWatch events. If the CFN Stack is kicked off midway through the hour, then the partitions in AWS Glue pointing to the correct S3 hour of logs aren't updated until midway through the hour. This creates a condition where Athena queries do not scan the correct S3 hour key until the Lambda kicks off updating the AWS Glue partitions.
Relevant template snippet:
To Reproduce Change QueryScheduledRunTime to 1 (line 294). Change line 1196 to ScheduleExpression: !Join ['', ['rate(', !FindInMap ["Solution", "Athena", "QueryScheduledRunTime"], ' minute)']]
Run the CFN template midway through any hour of the day with these params (making sure it completes by :45 after the hour or so):
Modify Kinesis Firehose hints to 60s and 1MB via console.
This works until the hour changes. Then the AWS Glue Partition is not updated to point at the next hour in the S3 WAF Logs bucket until the LambdaAddAthenaPartitionsEventsRule fires, which depends entirely on the minute within the hour at which the LambdaAddAthenaPartitionsEventsRule resource was created.
To reproduce, associate WAF with ALB. Send requests meeting the threshold for flood. Wait until the hour changes to the next hour. Repeat and watch the flood rule not engage until the next time the LambdaAddAthenaPartitionsEventsRule fires.
Expected behavior Expect HTTP Flood to add the IP to the Blacklist after approximately 2 minutes (when the threshold is reached) and revert it after 5 minutes (when requests stop) for every minute of every hour. This is the expectation with the parameters and modifications described above.
Additional context To fix the problem,
should be set to CRON at the top of every hour or to run every minute.
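The timing gap described in this report can be sketched as follows. The sketch assumes, as the report states, that a rate-based rule fires at fixed intervals counted from the moment the rule was created; the function names are hypothetical and this models the schedule only, not the Lambda itself.

```python
from datetime import datetime, timedelta

ONE_HOUR = timedelta(hours=1)


def next_rate_fire(created: datetime, now: datetime, period: timedelta = ONE_HOUR) -> datetime:
    """Next fire time strictly after `now` for a rule anchored at `created` (assumed behavior)."""
    periods_elapsed = (now - created) // period
    return created + (periods_elapsed + 1) * period


def next_top_of_hour(now: datetime) -> datetime:
    """Next hour boundary strictly after `now`."""
    return now.replace(minute=0, second=0, microsecond=0) + ONE_HOUR


def partition_lag(created: datetime) -> timedelta:
    """Time between the first hour rollover and the first rate-based fire at or after it."""
    rollover = next_top_of_hour(created)
    # Step back one microsecond so a fire landing exactly on the rollover counts.
    first_fire = next_rate_fire(created, rollover - timedelta(microseconds=1))
    return first_fire - rollover


if __name__ == "__main__":
    print(partition_lag(datetime(2020, 3, 5, 10, 37)))
```

Under that assumption, a stack created at 10:37 leaves each new hour's partition unregistered for 37 minutes, whereas a schedule pinned to minute 0, or one that runs every minute, removes the dependence on creation time.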