elastic / elastic-integration-corpus-generator-tool

Command line tool used for generating events corpus dynamically given a specific integration

add counter reset support #152

Open gpop63 opened 2 months ago

gpop63 commented 2 months ago

Overview

This PR introduces counter reset functionality. A counter field can be configured with one of three reset strategies: random, probabilistic, or after a fixed number of iterations (after_n).

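Strategies:

- random: resets the counter at random.
- after_n: resets the counter after reset_after_n iterations.
- probabilistic: resets the counter with the chance given by probability, as a percentage.

Examples: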

Random reset.

  - name: test_counter
    counter: true
    counter_reset:
      strategy: "random"

Reset counter after 5 iterations.

  - name: test_counter
    counter: true
    counter_reset:
      strategy: "after_n"
      reset_after_n: 5

20% chance to reset the counter.

  - name: test_counter
    counter: true
    counter_reset:
      strategy: "probabilistic"
      probability: 20

Test with config

configs.yml

```yaml
fields:
  - name: cloud.region
    enum: ["us-east-1", "us-east-2", "us-west-1", "us-west-2", "ap-south-1", "ap-northeast-3", "ap-northeast-2", "ap-southeast-1", "ap-southeast-2", "ap-northeast-1", "ca-central-1", "eu-central-1", "eu-west-1", "eu-west-2", "eu-west-3", "eu-north-1", "sa-east-1", "af-south-1", "ap-east-1", "ap-south-2", "ap-southeast-3", "eu-south-2", "eu-central-2", "me-south-1", "me-central-1"]
    cardinality: 25
  - name: cloud.account.id
    value: "123456789"
  - name: cloud.account.name
    value: sample-account
  - name: aws.billing.currency
    value: "USD"
  - name: aws.billing.ServiceName
    # NOTE: When empty the data refers to estimated charged for the entire account. We cannot reproduce the content (as it's a sum of previous data) but we want to provide the case.
    enum: ["", "AWSCloudTrail", "AWSCodeArtifact", "AWSConfig", "AWSCostExplorer", "AWSDataTransfer", "AWSELB", "AWSLambda", "AWSMarketplace", "AWSQueueService", "AWSSecretsManager", "AWSServiceCatalog", "AWSSystemsManager", "AWSXRay", "AmazonApiGateway", "AmazonCloudWatch", "AmazonCognito", "AmazonDynamoDB", "AmazonEC2", "AmazonECR", "AmazonEKS", "AmazonKinesis", "AmazonKinesisFirehose", "AmazonRDS", "AmazonRedshift", "AmazonRoute53", "AmazonS3", "AmazonSNS", "AmazonVPC", "awskms"]
  - name: agent.id
    value: "12f376ef-5186-4e8b-a175-70f1140a8f30"
  - name: agent.ephemeral_id
    value: "5fd278ce-2a12-4a09-a125-0c5b39aa69e3"
  - name: agent.name
    value: "host.local"
  - name: metricset.period
    value: 86400
  - name: aws.billing.group_definition.key
    # NOTE: repeated values are needed to produce 10% cases with "" value
    enum: ["", "AZ", "INSTANCE_TYPE", "SERVICE", "LINKED_ACCOUNT", "AZ", "INSTANCE_TYPE", "SERVICE", "LINKED_ACCOUNT"]
  - name: event.duration
    range:
      min: 1
      max: 1000
  - name: aws.billing.EstimatedCharges
    cardinality: 25
    fuzziness: 0.2
  - name: aws.billing.AmortizedCost.amount
    cardinality: 25
    fuzziness: 0.2
  - name: aws.billing.BlendedCost.amount
    cardinality: 25
    fuzziness: 0.2
  - name: aws.billing.NormalizedUsageAmount.amount
    cardinality: 25
    fuzziness: 0.2
  - name: aws.billing.UnblendedCost.amount
    cardinality: 25
    fuzziness: 0.2
  - name: aws.billing.UsageQuantity.amount
    cardinality: 25
    fuzziness: 0.2
  - name: aws.billing.group_definition.type
    value: "DIMENSION"
  - name: aws.billing.group_by.INSTANCE_TYPE
    enum: ["NoInstanceType", "a1.large", "c5.2xlarge", "c5.xlarge", "c6i.2xlarge", "db.r6g.2xlarge", "db.t2.micro", "dc2.large", "m5.large", "t1.micro", "t2.medium", "t2.micro", "t2.small", "t2.xlarge", "t3.2xlarge", "t3.medium", "t3.xlarge","t3.xlarge"]
  - name: aws.billing.group_by.SERVICE
    enum: ["Amazon Simple Storage Service", "Amazon Elastic Compute Cloud - Compute", "EC2 - Other", "Amazon Kinesis", "Amazon Relational Database Service", "Amazon Elastic Load Balancing", "AmazonCloudWatch", "AWS CloudTrail", "AWS Config", "AWS Key Management Service", "AWS Lambda", "AWS Secrets Manager", "AWS Service Catalog", "Amazon API Gateway", "Amazon DynamoDB", "Amazon EC2 Container Registry (ECR)", "Amazon Elastic Container Service for Kubernetes", "Amazon Kinesis Firehose", "Amazon Redshift", "Amazon Simple Notification Service", "Amazon Simple Queue Service", "Amazon Virtual Private Cloud"]
  - name: test_counter
    counter: true
    counter_reset:
      strategy: "after_n"
      reset_after_n: 5
```

fields.yml

```yaml
- name: timestamp
  type: date
- name: cloud.region
  type: keyword
- name: cloud.account.id
  type: keyword
- name: cloud.account.name
  type: keyword
- name: event.duration
  type: long
- name: metricset.period
  type: long
- name: aws.billing.currency
  type: keyword
- name: aws.billing.EstimatedCharges
  type: float # positive
- name: aws.billing.ServiceName
  type: keyword
- name: aws.billing.AmortizedCost.amount
  type: float # positive
- name: aws.billing.BlendedCost.amount
  type: float # positive
- name: aws.billing.NormalizedUsageAmount.amount
  type: integer # positive
- name: aws.billing.UnblendedCost.amount
  type: float # positive
- name: aws.billing.UsageQuantity.amount
  type: integer # positive
- name: agent.id
  type: keyword
- name: agent.name
  type: keyword
- name: agent.ephemeral_id
  type: keyword
  example: 12f376ef-5186-4e8b-a175-70f1140a8f30
- name: aws.billing.group_definition.key
  type: keyword
- name: aws.billing.start_date
  type: date
- name: aws.billing.group_definition.type
  type: keyword
- name: aws.billing.group_by.INSTANCE_TYPE
  type: keyword
- name: aws.billing.group_by.SERVICE
  type: keyword
- name: test_counter
  type: long
```

gotext.tpl

```
{{- $currency := generate "aws.billing.currency" }}
{{- $groupBy := generate "aws.billing.group_definition.key" }}
{{- $period := generate "metricset.period" }}
{{- $cloudId := generate "cloud.account.id" }}
{{- $cloudRegion := generate "cloud.region" }}
{{- $timestamp := generate "timestamp" }}
{
    "@timestamp": "{{$timestamp.Format "2006-01-02T15:04:05.999999Z07:00"}}",
    "cloud": {
        "provider": "aws",
        "region": "{{$cloudRegion}}",
        "account": {
            "id": "{{$cloudId}}",
            "name": "{{generate "cloud.account.name"}}"
        }
    },
    "event": {
        "dataset": "aws.billing",
        "module": "aws",
        "duration": {{generate "event.duration"}}
    },
    "metricset": {
        "name": "billing",
        "period": {{$period}}
    },
    "ecs": {
        "version": "8.2.0"
    },
    "aws": {
        "billing": {
{{- if eq $groupBy "" }}
            "Currency": "{{$currency}}",
            "EstimatedCharges": {{generate "aws.billing.EstimatedCharges"}},
            "ServiceName": "{{generate "aws.billing.ServiceName"}}"
{{- else }}
    {{- $sd := generate "aws.billing.start_date" }}
            "start_date": "{{ $sd.Format "2006-01-02T15:04:05.999999Z07:00" }}",
            "end_date": "{{ $sd | date_modify (print "+" $period "s") | date "2006-01-02T15:04:05.999999Z07:00" }}",
            "AmortizedCost": {
                "amount": {{printf "%.2f" (generate "aws.billing.AmortizedCost.amount")}},
                "unit": "{{$currency}}"
            },
            "BlendedCost": {
                "amount": {{printf "%.2f" (generate "aws.billing.BlendedCost.amount")}},
                "unit": "{{$currency}}"
            },
            "NormalizedUsageAmount": {
                "amount": {{generate "aws.billing.NormalizedUsageAmount.amount"}},
                "unit": "N/A"
            },
            "UnblendedCost": {
                "amount": {{printf "%.2f" (generate "aws.billing.UnblendedCost.amount")}},
                "unit": "{{$currency}}"
            },
            "UsageQuantity": {
                "amount": {{generate "aws.billing.UsageQuantity.amount"}},
                "unit": "N/A"
            },
            "group_definition": {
                "key": "{{$groupBy}}",
                "type": "{{generate "aws.billing.group_definition.type"}}"
            },
            "test_counter": {{generate "test_counter"}},
            "group_by": {
    {{- if eq $groupBy "AZ"}}
                "AZ": "{{awsAZFromRegion $cloudRegion}}"
    {{- else if eq $groupBy "INSTANCE_TYPE"}}
                "INSTANCE_TYPE": "{{generate "aws.billing.group_by.INSTANCE_TYPE"}}"
    {{- else if eq $groupBy "SERVICE"}}
                "SERVICE": "{{generate "aws.billing.group_by.SERVICE"}}"
    {{- else if eq $groupBy "LINKED_ACCOUNT"}}
                "LINKED_ACCOUNT": "{{$cloudId}}"
    {{- end}}
            }
{{- end}}
        }
    },
    "service": {
        "type": "aws"
    },
    "agent": {
        "id": "{{generate "agent.id"}}",
        "name": "{{generate "agent.name"}}",
        "type": "metricbeat",
        "version": "8.0.0",
        "ephemeral_id": "{{generate "agent.ephemeral_id"}}"
    }
}
```

go run main.go generate-with-template ./gotext.tpl ./fields.yml --config-file ./configs.yml --tot-events 10
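
When reading the 10 generated events, a reset should show up as a test_counter value that is lower than the one before it. The Go sketch below illustrates that check under simplifying assumptions: `simulateAfterN` and `countResets` are hypothetical helpers, the counter is assumed to grow by exactly 1 per event, and it is assumed to restart from zero. The tool's real step size and reset value may differ. Note also that the template renders test_counter only in the branch where aws.billing.group_definition.key is non-empty, so some events will not carry the field at all.

```go
package main

import "fmt"

// simulateAfterN returns the values a counter would emit over total events
// if it grows by 1 per event and restarts from zero once n values have been
// emitted since the last reset. The step size of 1 is an assumption.
func simulateAfterN(total, n int) []int {
	values := make([]int, 0, total)
	counter, sinceReset := 0, 0
	for i := 0; i < total; i++ {
		counter++
		sinceReset++
		values = append(values, counter)
		if n > 0 && sinceReset == n {
			counter, sinceReset = 0, 0
		}
	}
	return values
}

// countResets counts the positions where a value is lower than the one
// before it, which is how a reset appears in a monotonic counter series.
func countResets(values []int) int {
	resets := 0
	for i := 1; i < len(values); i++ {
		if values[i] < values[i-1] {
			resets++
		}
	}
	return resets
}

func main() {
	values := simulateAfterN(10, 5)
	fmt.Println(values)              // [1 2 3 4 5 1 2 3 4 5]
	fmt.Println(countResets(values)) // 1
}
```

The same `countResets` idea can be applied to the test_counter values extracted from the generated corpus to confirm that resets occur where the configured strategy says they should.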

Relates: #141 Closes: #124

shmsr commented 2 months ago

Can you also add documentation or examples that people can refer to? Features like this need to be thoroughly documented so that their behavior is understandable.

Also, a few comments:

shmsr commented 2 months ago

Also, if a critical change is being made here, link this PR to the main issue itself, so that others can also discuss whether it is implemented as expected.