elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

[Functionbeat] Allow to automatically subscribe to new cloudwatch/sqs/kinesis trigger. #10756

Closed ph closed 8 months ago

ph commented 5 years ago

The description below is relevant for SQS, Kinesis, and CloudWatch log groups.

The current workflow of Functionbeat requires you to know the name of the CloudWatch log group beforehand; you cannot register a wildcard. This causes problems when you have a very large number of log groups, or when you create and remove them dynamically.

We should add an autodiscover mode to Functionbeat that listens for events coming from your AWS infrastructure and automatically subscribes new log groups to your Functionbeat lambda.
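
For context, the explicit workflow this issue wants to relax looks roughly like the sketch below; the function name, bucket, and log group names are placeholders, not values from this issue:

# functionbeat.yml - sketch of the current, explicit workflow
functionbeat.provider.aws.deploy_bucket: "functionbeat-deploy"   # placeholder bucket
functionbeat.provider.aws.functions:
  - name: fnb-cloudwatch          # placeholder function name
    enabled: true
    type: cloudwatch_logs
    triggers:
      # Every log group must be named explicitly; no wildcards today.
      - log_group_name: /aws/lambda/service-a
      - log_group_name: /aws/lambda/service-b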

ph commented 5 years ago

@kvch This is a really interesting feature that might interest you. I am not sure of the implementation yet; I presume the permissions will be the hardest part here.

tfendt commented 5 years ago

+1 on this. Please allow a wildcard option for log_group_name. We have 1000+ log groups in one environment. They get added (and removed) automatically when a new serverless service is created and deployed, which in our dev environment happens frequently.

spectorar commented 5 years ago

+1 to the wildcard solution. One thing I dislike about AWS's CloudWatch -> Elasticsearch Service feature is that it has to be configured per log group. As others have said, there can be a lot of log groups. Allowing wildcards/prefixes in the config for Functionbeat would greatly simplify this. An autodiscover feature feels too broad; subscribing to every new CloudWatch log group would be too much for our use case.

phillpafford commented 5 years ago

+1 for wildcard

jasonslater2000 commented 5 years ago

+1 on this as well; without wildcards/autodiscover, configuration would be untenable.

jderose9 commented 5 years ago

I'm struggling to see how functionbeat can be useful at scale without this feature.

johnnydimas commented 4 years ago

This is a must-have!

charlieparkes commented 4 years ago

We were planning to use functionbeat, but without this option, I'm not sure how it's even possible at scale.

cluggas commented 4 years ago

Rather than wildcarding the log groups, would you consider removing the need for the triggers section in the functionbeat.yml file entirely?

AWS treats the Log Group Subscription and the Subscription Target as different resources, and only couples them in one direction, allowing Subscriptions to scale up independently of Targets. By requiring the triggers list in functionbeat.yml, the coupling goes in the other direction, from Target to Subscription, which makes for an inflexible and unscalable architecture.

Could triggers be made optional? I realise it is required if you are using the Manager to deploy the Log Group Subscriptions, but anyone using this at scale probably won't be using the Manager.

kvch commented 4 years ago

@cluggas You raise a valid point. Could you please open a separate issue for your request?

kaykhancheckpoint commented 4 years ago

Did anything come of this? Has anyone found an alternative solution to deal with auto-discovering CloudWatch log groups while we wait for an official one?

greendad commented 4 years ago

@kaykhancheckpoint One way would be to standardise all the log groups you are interested in with a common name prefix, and then inject the matching log groups dynamically into the Functionbeat configuration when deploying it. When a new log group is added, a trigger would automatically re-run the Functionbeat deployment. I am not sure if it exactly fits your use case, and FYI, I haven't implemented the trigger yet.
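
A minimal sketch of that injection step, assuming boto3 and PyYAML are available; the prefix and file handling are placeholders rather than anything from this thread:

# generate_triggers.py - collect log groups by prefix and emit a
# functionbeat triggers fragment (prefix below is a placeholder)
import boto3
import yaml

def log_groups_with_prefix(prefix):
    """Yield every CloudWatch log group name sharing the given prefix."""
    logs = boto3.client("logs")
    paginator = logs.get_paginator("describe_log_groups")
    for page in paginator.paginate(logGroupNamePrefix=prefix):
        for group in page["logGroups"]:
            yield group["logGroupName"]

if __name__ == "__main__":
    triggers = [{"log_group_name": name}
                for name in log_groups_with_prefix("/aws/lambda/myteam-")]
    # Merge this fragment into functionbeat.yml before running deploy.
    print(yaml.safe_dump({"triggers": triggers}, default_flow_style=False))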

swiftmas commented 4 years ago

We have been looking at a solution that would be functional with @cluggas suggestion.

If Functionbeat did not require a triggers list and instead dealt with any triggers assigned to the lambda, you could create hooks in your service deployments that attach their log groups to Functionbeat. Permissions are the only thing in the configuration that can take a wildcard, so they can be grouped by naming convention.

Nighthawk14 commented 3 years ago

We've encountered the same issue, as we do not use the Functionbeat manager, and we would prefer to have the triggers object be optional because all our log group subscriptions are managed via our IaC tool (Terraform).

As a workaround, we've changed the log group subscriptions to point to Kinesis (Log Group Subscription > Kinesis Data Stream > Functionbeat Lambda). The triggers object is still mandatory, but at least it is reduced to a single line (the Kinesis stream ARN), and the lambda trigger from Kinesis can be managed via Terraform.
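
With that workaround, the Functionbeat side collapses to roughly the sketch below; the stream ARN and function name are placeholder values:

functionbeat.provider.aws.functions:
  - name: fnb-kinesis             # placeholder function name
    enabled: true
    type: kinesis
    triggers:
      # One trigger: every subscribed log group fans into this stream.
      - event_source_arn: arn:aws:kinesis:us-east-1:123456789012:stream/central-logs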

Pidz-b commented 3 years ago

Any progress? We'd love to use this beat in our microservice architecture, but the way it works currently is very labor-intensive...

malachantrio commented 3 years ago

Without this feature Functionbeat is so unwieldy. It's kind of crazy that it was designed so that you have to explicitly name every function and log group you want to ship in one monolithic config file.

For us this basically means that every new service involves a lengthy back and forth between devs and the platform team trying to get the config right on both ends so that Functionbeat actually ships the logs.

malachantrio commented 3 years ago

> Has anyone found an alternative solution to deal with auto-discovering CloudWatch log groups while we wait for an official one?

Unfortunately not. It seems that while you can fool the serverless deploy into wildcarding the invoke permissions on the Functionbeat lambda itself, Functionbeat still double-checks that a log event (which AWS has already allowed through to the Functionbeat lambda) comes from a source explicitly mentioned in the Functionbeat config YAML file. We thought we had it cracked and could get apps to self-register, but it looks like Functionbeat has foiled us yet again :(

tehmaspc commented 3 years ago

@Nighthawk14

> We've encountered the same issue, as we do not use the Functionbeat manager, and we would prefer to have the triggers object be optional because all our log group subscriptions are managed via our IaC tool (Terraform).
>
> As a workaround, we've changed the log group subscriptions to point to Kinesis (Log Group Subscription > Kinesis Data Stream > Functionbeat Lambda). The triggers object is still mandatory, but at least it is reduced to a single line (the Kinesis stream ARN), and the lambda trigger from Kinesis can be managed via Terraform.

This is the model I want to move my Fnb deployment to, since we've used it before w/ other tools (e.g. CW Logs -> Kinesis (via subscription filter) -> AWS ES).

But how are you doing it w/ Fnb to ensure that CW Logs representing different services send their data to the Kinesis stream Fnb monitors, and that Fnb then sends the data to ES into the proper index? Are you doing any ETL'ing w/i Fnb, or using ingest pipelines on the ES side? Curious as to your approach. Thanks!

Nighthawk14 commented 3 years ago

@tehmaspc We don't do any specific ETL on the data, and we don't have any ingest pipeline on the ES side. The only thing we do is add this to the functionbeat.yml configuration file:

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

This will add the log-stream name into your ES document, giving you the exact service (assuming each of your services writes CW logs to a separate log stream inside the same log group). So in ES we get the log-group value being the name of our service, and the log-stream value being the ID of the specific service instance.

kumarnarendra701 commented 3 years ago

Any progress on autodiscover or adding multiple log groups?

LBalsa commented 3 years ago

+1

eyearian-oak9 commented 3 years ago

+1

shaundclements commented 2 years ago

Would it be possible to pass in an override of the functionbeat.yml file? I.e., when performing the export function, include an array of identified lambdas as a parameter: -E "functionbeat.provider.aws.functions.triggers=...

ravikesarwani commented 2 years ago

BTW, did you take a look at the elastic-serverless-forwarder? Here's a blog to get a high-level view. This is something new that we are investing in, and we would like to hear practitioners' usage and feedback on solving real customer pain points.

KeyanatGiggso commented 2 years ago

Hey @ravikesarwani, got the blog on elastic-serverless-forwarder. I want to know about Logstash compatibility and custom processors like in the other beats. Can you brief us on it?

kaykhan commented 2 years ago

I saw this issue a while back and continued with just adding lambda function triggers manually. But we recently ran into an issue which seems inevitable: when you start to list a lot of lambda functions (40+), you will come across this error:

The final policy size (20576) is bigger than the limit (20480). (Service: AWSLambdaInternal; Status Code: 400; Error Code: PolicyLengthExceededException; Request ID: 4815dab8-f579-439b-977a-dd71810d168a; Proxy: null)

The Functionbeat lambda function creates a bunch of permissions, one for each lambda you declare, and is then blocked by the maximum policy size that AWS has a hard limit on.

Ultimately it means we now can't log any new lambda functions.
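
The limit being hit is on the Lambda function's resource-based policy, to which each trigger adds an InvokeFunction permission statement. A rough way to check how close a deployed function is to the 20480-byte cap, assuming boto3 and a placeholder function name:

import boto3

def policy_size(function_name):
    """Return the size in bytes of a Lambda function's resource policy."""
    lam = boto3.client("lambda")
    policy = lam.get_policy(FunctionName=function_name)["Policy"]
    return len(policy.encode("utf-8"))

# Placeholder name; AWS rejects new triggers once this nears 20480 bytes.
print(policy_size("fnb-cloudwatch"))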

tehmaspc commented 2 years ago

> I saw this issue a while back and continued with just adding lambda function triggers manually. But we recently ran into an issue which seems inevitable: when you start to list a lot of lambda functions (40+), you will come across this error:
>
> The final policy size (20576) is bigger than the limit (20480). (Service: AWSLambdaInternal; Status Code: 400; Error Code: PolicyLengthExceededException; Request ID: 4815dab8-f579-439b-977a-dd71810d168a; Proxy: null)
>
> The Functionbeat lambda function creates a bunch of permissions, one for each lambda you declare, and is then blocked by the maximum policy size that AWS has a hard limit on.
>
> Ultimately it means we now can't log any new lambda functions.

Yes - you will hit this limit if you have too many triggers within each Lambda (or Functionbeat) configuration. What I ultimately did is define a logical grouping of Functionbeats and document the limit on the number of triggers we put into any individual Functionbeat. That way we can group as many triggers as possible into a single FnB while trying not to have too many FnBs, to reduce management overhead and IP address exhaustion (since all our Functionbeats deliver data to an on-prem Elasticsearch cluster).

For my business I defined a policy that each high-level service or team gets its own Functionbeat config file (per AWS account), and within each config file we define environment-specific Functionbeats so that our service teams can deploy separate Functionbeats for different environments of their services. In most cases, these teams will have a few Functionbeats per AWS account delivering log data for their set of microservices. When we can't find a clean dividing line, I allow the teams to have more than one Functionbeat config file, while balancing resources as best as possible.

E.g., a rough FnB repo setup:

config/
  /pre-prod
    <team1>-<service1>.yml
      {fnb-team1-service1-dev, fnb-team1-service1-test, fnb-team1-service1-stag}
    <team2>-<service1>.yml
    ...
  /prod
    <team1>-<service1>.yml
      {fnb-team1-service1-prd}
    ...

kaykhan commented 2 years ago

@tehmaspc Yes, so you are deploying multiple Functionbeat CloudWatch lambda functions (batching ~40 at a time). It's a solution, but it is not so elegant. Thanks.

kaykhan commented 2 years ago

@tehmaspc I'm curious if you have thought about using a Kinesis stream instead. Each lambda function you declare produces a log group, and on the log group you can set a log subscription (subscription filter) pointing to a Kinesis data stream.

tehmaspc commented 2 years ago

Hey @kaykhan - I haven't used a Kinesis stream w/ Functionbeat, but prior to going the Functionbeat route we were successfully using Kinesis Firehose to collect data from all the AWS CW log groups we were interested in and sending that log data to the sinks Kinesis Firehose supported. At that time we were using AWS' ES service (now OpenSearch) and it worked great; our business simply wanted to move log data in-house. If you can go the AWS CW Log Group -> Firehose -> {Splunk, OpenSearch, etc.} route, I'd recommend it - it works well. The only complication is figuring out how to properly store the data in your sink - e.g., if you send it from Firehose to OpenSearch, you may not want all your data coming from CW Logs to go into one index, so you need to figure out how to split the data out, either upstream via more Firehose infra or downstream at the OpenSearch layer across multiple index families/etc.

In addition, when we used Firehose to deliver all this data to our AWS ES infrastructure, we set up the subscription filter per log group via Terraform, pointing to the proper environment-specific Kinesis Firehose stream during application stack deployment (within our deployment pipelines) - and thus it was very seamless for our development teams.

HTH!

Cheers, Tehmasp

ravikesarwani commented 2 years ago

@KeyanatGiggso Sorry I missed your earlier question. Elastic Serverless Forwarder supports both Elastic Cloud and self-managed Elastic deployments. Logstash output is not supported as of right now, but it's on the roadmap. Common pre-processing functions like include/exclude filters, tags, JSON content discovery, and multiline processing are already supported by Elastic Serverless Forwarder. Other custom transformation tasks can be implemented using Elasticsearch ingest pipelines.

We'd like to understand your use case in detail so that we can help you utilize Elastic Serverless Forwarder.

Barhuumi commented 2 years ago

The auto-subscription can be handled by yourself pretty easily. I'd also recommend Kinesis or SQS as your integration point rather than going direct to the lambda, as it provides better control over the number of concurrent lambdas you spin up.

You can consume an event from CloudTrail every time a new CloudWatch log group is created, and have your event handler set a subscription filter pointing to your stream. (I guess you could do the same if you opt for the lambda route and set your lambda as the target in the subscription filter.) See the sketch after the links below.

The below blog post shares the concept and has examples.

https://theburningmonk.com/2017/08/centralised-logging-for-aws-lambda/
https://theburningmonk.com/2018/07/centralised-logging-for-aws-lambda-revised-2018/
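
The handler itself can stay small. A minimal sketch, assuming an EventBridge rule matching CloudTrail's CreateLogGroup call; the stream ARN, role ARN, and filter name are placeholders, not values from this thread:

import boto3

logs = boto3.client("logs")

# Placeholders: destination stream and the IAM role CloudWatch Logs assumes.
DESTINATION_ARN = "arn:aws:kinesis:us-east-1:123456789012:stream/central-logs"
ROLE_ARN = "arn:aws:iam::123456789012:role/cwlogs-to-kinesis"

def handler(event, context):
    """Subscribe every newly created log group to the central stream."""
    # CloudTrail-sourced events carry the API call parameters here.
    group = event["detail"]["requestParameters"]["logGroupName"]
    logs.put_subscription_filter(
        logGroupName=group,
        filterName="ship-to-central-logs",   # placeholder filter name
        filterPattern="",                    # empty pattern = forward all events
        destinationArn=DESTINATION_ARN,
        roleArn=ROLE_ARN,
    )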

botelastic[bot] commented 1 year ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!