elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

[Metricbeat] AWS Billing Module accept pairs of group_by values #34193

Open thomascate opened 1 year ago

thomascate commented 1 year ago

Currently the AWS billing module takes in an arbitrary number of group_by dimensions. However, it uses each one individually, pairing it with a group_by tag. Without multi-dimension grouping, the data for consolidated billing is much less useful.

For example, say you have 10 accounts and you scan your management account with the LINKED_ACCOUNT and SERVICE dimensions. With the current AWS Billing Module, you'll get two sets of documents:

  1. Documents with a summary of total service usage across all accounts.
  2. Documents with a summary of total bill for each account.

This means that you cannot drill into how much usage each account had by service.

Ideally, the module would accept pairs of group_by values. If you pass those same two group_by dimensions to the API together, you get one document per account per service, which allows much deeper filtering.
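To illustrate what passing both dimensions together looks like, here is a minimal sketch that builds request parameters in the shape the Cost Explorer GetCostAndUsage API expects (for example as keyword arguments to boto3's `get_cost_and_usage`). The helper name `build_cost_request`, the dates, and the chosen metric are assumptions for illustration; this is not how the Metricbeat module builds its requests.

```python
from datetime import date


def build_cost_request(start, end, group_by):
    """Build parameters shaped like a Cost Explorer GetCostAndUsage request.

    group_by is a list of (type, key) tuples. Hypothetical helper, for
    illustration only.
    """
    # Cost Explorer allows at most two group-by entries per request.
    if not 1 <= len(group_by) <= 2:
        raise ValueError("expected one or two group_by entries")
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": kind, "Key": key} for kind, key in group_by],
    }


# Both dimensions in ONE request: one result group per account per service.
# SERVICE is the Cost Explorer dimension key for the AWS service name.
request = build_cost_request(
    date(2023, 1, 1),
    date(2023, 1, 2),
    [("DIMENSION", "LINKED_ACCOUNT"), ("DIMENSION", "SERVICE")],
)
```

With both entries in a single `GroupBy` list, each returned group is keyed by an (account, service) combination rather than by one dimension alone.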

There are some more details here.

BenB196 commented 1 year ago

Definitely agree that there is room for improvement around the AWS billing module. I recently started poking around at it and noticed the same issues when trying to analyze the various group_by dimensions.

The interesting thing I found is that if you set a standard group_by dimension (ex: LINKED_ACCOUNT) and then a tag group_by dimension, the module will put both group_bys on the document. The one issue with relying on the tag groups is that tags aren't guaranteed by AWS to be available on billing events, and even then it puts more onus on the user to set and use billing tags. Having the ability to use two "standard" group_by fields would be ideal.

BenB196 commented 1 year ago

One thing I did consider was potentially using Elasticsearch transforms to repivot this data into something more usable, but I haven't gotten around to experimenting with it yet.

thomascate commented 1 year ago

Yeah, I think defaulting to having the tag group_by is what causes the issue.

I don't think you can pivot the data unless you query it with multiple dimensions in the first place, since that's the only way to get details such as "Total EC2 usage for account foo". That specific data requires two dimensions in the query.
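A small sketch of why the pivot can't work, using made-up numbers: two different per-account, per-service breakdowns can produce exactly the same per-account and per-service totals, so the single-dimension documents don't contain enough information to recover the combined view. The account names, services, and amounts below are invented for illustration.

```python
from collections import defaultdict


def marginals(costs):
    """Collapse {(account, service): amount} into per-account and per-service totals."""
    by_account = defaultdict(float)
    by_service = defaultdict(float)
    for (account, service), amount in costs.items():
        by_account[account] += amount
        by_service[service] += amount
    return dict(by_account), dict(by_service)


# Two different (hypothetical) joint breakdowns...
breakdown_a = {
    ("foo", "EC2"): 10.0, ("foo", "S3"): 0.0,
    ("bar", "EC2"): 0.0, ("bar", "S3"): 10.0,
}
breakdown_b = {
    ("foo", "EC2"): 5.0, ("foo", "S3"): 5.0,
    ("bar", "EC2"): 5.0, ("bar", "S3"): 5.0,
}

# ...that are indistinguishable once each dimension is summarized on its own.
same_totals = marginals(breakdown_a) == marginals(breakdown_b)
```

Here `same_totals` is true even though "Total EC2 usage for account foo" differs between the two breakdowns, which is the information lost when each dimension is queried separately.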

I think letting the user give a pair of group_by options of either type is the most flexible. Then they can do 1 dimension and 1 tag like now, or 2 dimensions, or even 2 tags. Even nicer would be a list of pairs, where we just iterate through each pair.
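Roughly what I have in mind for the list of pairs, as a sketch only. The function name `expand_group_by_pairs`, the tuple-based input format, and the example tag keys `team` and `env` are all hypothetical; none of this exists in the module today.

```python
VALID_TYPES = {"DIMENSION", "TAG"}


def expand_group_by_pairs(pairs):
    """Turn a list of group_by pairs into one GroupBy list per API request.

    Each pair holds one or two (type, key) tuples. Hypothetical helper.
    """
    group_bys = []
    for pair in pairs:
        if not 1 <= len(pair) <= 2:
            raise ValueError(f"expected one or two entries, got {len(pair)}")
        group_by = []
        for kind, key in pair:
            if kind not in VALID_TYPES:
                raise ValueError(f"unsupported group_by type: {kind}")
            group_by.append({"Type": kind, "Key": key})
        group_bys.append(group_by)
    return group_bys


# Hypothetical configuration: every combination the user cares about.
pairs = [
    [("DIMENSION", "LINKED_ACCOUNT"), ("TAG", "team")],           # 1 dimension, 1 tag (like now)
    [("DIMENSION", "LINKED_ACCOUNT"), ("DIMENSION", "SERVICE")],  # 2 dimensions
    [("TAG", "team"), ("TAG", "env")],                            # 2 tags
]
group_bys = expand_group_by_pairs(pairs)
```

Each element of `group_bys` would drive one request, so the number of API calls grows with the number of configured pairs rather than with every possible combination.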

elasticmachine commented 1 year ago

Pinging @elastic/integrations (Team:Integrations)

botelastic[bot] commented 8 months ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

Gawne commented 6 months ago

Hello, we would also value the ability to search on multiple group_by dimensions in order to retrieve more granular info on service usage per account when pulling data from a consolidated management account.

We have raised a feature request internally with Elastic.

Thanks, Matt