elastic / elastic-serverless-forwarder

Elastic Serverless Forwarder

Increasing cloudwatch log groups subscription limit #560

Closed whc4017 closed 7 months ago

whc4017 commented 9 months ago

Hi, I am exploring the serverless forwarder to forward Lambda logs from an AWS account to Elasticsearch. Upon deploying the stack, I found that it hits a CloudFormation quota. Specifically, the parameter value for ElasticServerlessForwarderCloudWatchLogsEvents cannot exceed 4096 bytes. https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cloudformation-limits.html

I tried several ways to work around this hard limit, namely joining multiple variables to form ElasticServerlessForwarderCloudWatchLogsEvents in CloudFormation stacks, or deploying multiple forwarders in the same account. However, both approaches fail. The 1st approach fails because the nested stack contains the single ElasticServerlessForwarderCloudWatchLogsEvents parameter, and the 2nd approach fails due to a name clash in the macro.

Would it be possible to support this need in a new version? Or are any other workarounds possible (deploying the macro and forwarder separately, or using the AWS integration instead)? In the future, we would like to send logs from hundreds of Lambdas to Elastic Cloud, so this ceiling is blocking us. Thank you for your help.

NYFreddie commented 7 months ago

My company is also running into this limitation. Support just advised that, since it's a "known limitation", it is not a bug. Development advises deploying directly with a bash script, but this is not a viable solution for our environment.

In case this helps someone else, though:

https://www.elastic.co/guide/en/esf/current/aws-deploy-elastic-serverless-forwarder.html#aws-serverless-forwarder-direct-deploy

aspacca commented 7 months ago

@whc4017 we are aware of this limit and we are going to state it clearly in the documentation and point to the alternative deployment solution we currently support, that's the one mentioned by @NYFreddie : https://www.elastic.co/guide/en/esf/current/aws-deploy-elastic-serverless-forwarder.html#aws-serverless-forwarder-direct-deploy

The limit is a mix of CloudFormation limits and relying on SAR for deployment, as well as the fact that we want to provide a deployment experience that not only deploys the ESF Lambda and its components (like the continuing and replay queues) but also spares users from having to manage the ESF triggers and the required IAM permissions themselves.

In the specific context of SAR, achieving that deployment UX in CloudFormation requires the use of a Macro. Since CloudFormation does not support "dynamic" Macro references, we are consequently limited to a single SAR deployment of ESF per region. That's exactly the reason for the failure of your 2nd approach.

Your 1st approach is currently under discussion: as you experienced, it will require two separate Parameters (something like ElasticServerlessForwarderCloudWatchLogsEvents and ElasticServerlessForwarderCloudWatchLogsEvents2).

This approach will work for a while: we can add as many separate Parameters as needed, so that the 4096-byte limit multiplied by the number of Parameters covers the byte size of the ARN list you need to provide. However, there is also a CloudFormation limit on the total number of Parameters that can be defined in a single template.
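To illustrate the splitting described above, here is a minimal Python sketch that greedily packs log group ARNs into comma-separated values of at most 4096 bytes each, one value per Parameter. It assumes the Parameter value is a comma-delimited list of ARNs; the function name `pack_arns` is hypothetical and not part of ESF.

```python
from typing import Iterable, List

# CloudFormation limit on the size of a single parameter value, in bytes.
MAX_PARAM_BYTES = 4096


def pack_arns(arns: Iterable[str], limit: int = MAX_PARAM_BYTES) -> List[str]:
    """Greedily pack ARNs into comma-separated strings of at most `limit` bytes.

    Hypothetical helper: the order of the ARNs is preserved, and each returned
    string is intended to be used as the value of one Parameter.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_size = 0  # byte size of ",".join(current)

    for arn in arns:
        arn_size = len(arn.encode("utf-8"))
        if arn_size > limit:
            raise ValueError(f"single ARN exceeds {limit} bytes: {arn}")

        # A comma separator is only needed when the chunk already has entries.
        extra = arn_size + (1 if current else 0)
        if current_size + extra > limit:
            # The current chunk is full: flush it and start a new one.
            chunks.append(",".join(current))
            current = [arn]
            current_size = arn_size
        else:
            current.append(arn)
            current_size += extra

    if current:
        chunks.append(",".join(current))
    return chunks
```

Each returned string would then be passed as a separate Parameter, such as ElasticServerlessForwarderCloudWatchLogsEvents and ElasticServerlessForwarderCloudWatchLogsEvents2. The number of chunks also indicates when the list has outgrown the Parameters available.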

The above summarizes the reasons why, beyond a certain number of AWS resources that users want to use as triggers for ESF, SAR deployment is not the deployment solution they should rely on. Again, we are aware this is currently not clear in the ESF documentation; we apologize for that and we are going to address it.

Please note that the alternative deployment method that we offer and support, the direct one, while not based on SAR and not using CloudFormation Parameters, is still based on CloudFormation, and you may hit CloudFormation limits even with this solution. The main difference from SAR deployment is that you can have multiple deployments (since it does not use any Macro), so your 2nd approach is feasible in the direct deployment context. To be a little more concrete: we expect that you will need only one direct deployment if two separate Parameters would be enough for you with SAR deployment. But depending on how many Parameters you currently need or will need in the future, you might also need to split the ARN list across multiple direct deployments.

aspacca commented 7 months ago

@NYFreddie thanks for chiming in and providing the link to the direct deployment solution

Please, see my comment above about the details of the current situation, both for SAR deployment and direct deployment.

We are aware that direct deployment, while it overcomes the limits of SAR deployment, is a bash script and so is not a viable solution for every environment, as in your case.

We are already discussing a pure Terraform deployment solution. I cannot currently give you any ETA or expectation, but the value is clear and I will bring your case to the discussion.

Thanks

aspacca commented 7 months ago

@whc4017 @NYFreddie

Please note that v1.13.0 was released yesterday on SAR, including duplication of CF Parameters (https://github.com/elastic/elastic-serverless-forwarder/pull/627)

This should unblock you for the moment

ioancatana commented 4 months ago

I still have this issue when using ElasticServerlessForwarderCloudWatchLogsEvents2.

I split the list of CloudWatch Logs ARNs in half, using Terraform:

Error: waiting for Serverless Application Repository CloudFormation Stack.... was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: The following resource(s) failed to create: [ApplicationElasticServerlessForwarderCloudWatchLogsEventabbdbd75abPermission]

semantic version: 1.13.1

@aspacca can you please re-open this issue? If I add 1 new cloudwatch group, it will fail to update due to Policy size.
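The failure above is attributed to policy size. As a rough aid, and assuming the limit in question is the Lambda quota of 20 KB on a function's resource-based policy (each trigger permission adds a statement to it), the following Python sketch reports how much of that budget a policy document uses. The function name `policy_usage` is hypothetical; the policy JSON could be obtained, for example, from the `Policy` field returned by boto3's Lambda `get_policy` call, and the byte count only approximates how AWS measures the document.

```python
import json
from typing import Dict

# Assumed limit: Lambda quota for a function's resource-based policy (20 KB).
LAMBDA_POLICY_LIMIT_BYTES = 20 * 1024


def policy_usage(policy_json: str, limit: int = LAMBDA_POLICY_LIMIT_BYTES) -> Dict[str, int]:
    """Summarize the size of a resource-based policy document.

    Hypothetical helper: returns the number of statements, the UTF-8 byte
    size of the document as given, and the bytes remaining under `limit`.
    """
    document = json.loads(policy_json)
    statements = document.get("Statement", [])
    # IAM policy grammar allows a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]

    size = len(policy_json.encode("utf-8"))
    return {
        "statements": len(statements),
        "bytes": size,
        "remaining_bytes": limit - size,
    }
```

Dividing `bytes` by `statements` gives a rough per-trigger cost, which can help estimate how many more log groups a single forwarder can accept before the ARN list needs to be split across deployments.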