elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

AWS Module: ELB module doesn't refresh permissions from IAM #21351

Closed dudumiquim closed 4 years ago

dudumiquim commented 4 years ago

I ran the same test with the CloudTrail and Vpcflow filesets, and both work correctly. With ELB, it looks like the application isn't refreshing the token it uses to connect to the AWS API. When I start the process, the queue is read, but after a few hours the queue is no longer read.

Version: filebeat version 7.9.1 (amd64), libbeat 7.9.1 [ad823eca4cc74439d1a44351c596c12ab51054f5 built 2020-09-01 19:58:51 +0000 UTC]

Operating System: Linux 4.14.193-149.317.amzn2.x86_64 #1 SMP Thu Sep 3 19:04:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

The Filebeat log shows:

2020-09-28T12:52:13.072-0300    DEBUG   [input] input/input.go:139  Run input
2020-09-28T12:52:13.072-0300    DEBUG   [input] input/input.go:139  Run input
2020-09-28T12:52:13.867-0300    ERROR   [s3]    s3/input.go:207 SQS ReceiveMessageRequest failed: MissingRegion: could not find region configuration
2020-09-28T12:52:13.867-0300    ERROR   [s3]    s3/input.go:207 SQS ReceiveMessageRequest failed: MissingRegion: could not find region configuration
2020-09-28T12:52:13.867-0300    ERROR   [s3]    s3/input.go:207 SQS ReceiveMessageRequest failed: MissingRegion: could not find region configuration
2020-09-28T12:52:14.720-0300    WARN    [s3]    s3/input.go:299 Half of the set visibilityTimeout passed, visibility timeout needs to be updated
2020-09-28T12:52:14.740-0300    ERROR   [s3]    s3/input.go:304 SQS ChangeMessageVisibilityRequest failed: InvalidParameterValue: Value {hash_ommited} for parameter ReceiptHandle is invalid. Reason: The receipt handle has expired.

My filebeat.yml is:

logging.level: debug
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644
  rotateeverybytes: 1048576000

filebeat.modules:
- module: aws
  elb:
    enabled: true
    var.queue_url: SQS_URL

filebeat.inputs:
- type: s3
  queue_url: SQS_URL
  expand_event_list_from_field: Records
  enabled: true
  fields:
    tipo_log: "aws.elb"

setup.template:
  name: "%{[fields.tipo_log]}"
  pattern: "%{[fields.tipo_log]}-*"
  settings:
    index.number_of_shards: 1

output.logstash:
  enabled: true
  hosts: ["logstahserver:port"]

kaiyan-sheng commented 4 years ago

@dudumiquim Thank you for reporting this error. So this MissingRegion error message did not show up from the beginning?

ERROR   [s3]    s3/input.go:207 SQS ReceiveMessageRequest failed: MissingRegion: could not find region configuration

I want to make sure I understand this correctly: the var.queue_url did not change but 10 hours later, MissingRegion error message starts to show up and Filebeat stopped ingesting ELB logs?

I will try to reproduce this problem. Thank you again!

elasticmachine commented 4 years ago

Pinging @elastic/integrations-platforms (Team:Platforms)

dudumiquim commented 4 years ago

MissingRegion error message did not show up from the beginning?

Almost all of the log messages are:

DEBUG   [input] input/input.go:139  Run input
ERROR   [s3]    s3/input.go:207 SQS ReceiveMessageRequest failed: MissingRegion: could not find region configuration

I left Filebeat running for many hours and ended up with ~1GB of these lines. Occasionally I also get a line with SQS ChangeMessageVisibilityRequest failed ...

I want to make sure I understand this correctly: the var.queue_url did not change but 10 hours later, MissingRegion error message starts to show up and Filebeat stopped ingesting ELB logs?

Yes, the queue is the same. I'm not sure whether the problem occurs after exactly 10 hours, but it happens somewhere between 8 and 10 hours in. I started the application and left the computer. When I came back, this error was in the log and the queue had accumulated a lot of messages. I restarted Filebeat and the queue started being processed again.

This line from the log above caught my attention: 2020-09-28T12:52:14.740-0300 ERROR [s3] s3/input.go:304 SQS ChangeMessageVisibilityRequest failed: InvalidParameterValue: Value {hash_ommited} for parameter ReceiptHandle is invalid. Reason: The receipt handle has expired.

It looks like Filebeat was running with an old access key, and when I restarted it, Filebeat got a new access key to call the AWS API.

kaiyan-sheng commented 4 years ago

Ahh, thanks for pointing this error message out. The "receipt handle has expired" error occurs when the visibility timeout drops to 0 before the ChangeMessageVisibilityRequest API call can be executed. Does the beginning of your debug log show what the visibility timeout is set to?
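
To make the timing concrete, here is a minimal sketch in Python of the race described above. It is not Filebeat's actual code: the function names and parameters are hypothetical, and it assumes, following the explanation in this thread, that the receipt handle stops being usable once the visibility timeout has fully elapsed. The half-timeout mark corresponds to the WARN line "Half of the set visibilityTimeout passed" in the log.

```python
def half_timeout_mark(received_at: float, visibility_timeout: float) -> float:
    """Time at which the 'half of the visibility timeout passed' warning would fire.

    Hypothetical helper: models the WARN line in the log, not Filebeat internals.
    """
    return received_at + visibility_timeout / 2.0


def extension_arrives_in_time(
    received_at: float, visibility_timeout: float, request_at: float
) -> bool:
    """Return True if the extension request is issued before the timeout elapses.

    Assumption: once the timeout has elapsed, the receipt handle is treated
    as expired and ChangeMessageVisibility fails.
    """
    return request_at < received_at + visibility_timeout


if __name__ == "__main__":
    timeout = 300.0  # seconds, the documented default for var.visibility_timeout
    mark = half_timeout_mark(0.0, timeout)
    print("extension due at:", mark)
    print("on time:", extension_arrives_in_time(0.0, timeout, mark))
    print("delayed past timeout:", extension_arrives_in_time(0.0, timeout, 310.0))
```

If the extension request is delayed past the full timeout, for example because earlier API calls keep failing, the second function returns False, which matches the situation where the "receipt handle has expired" error would appear.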

kaiyan-sheng commented 4 years ago

The MissingRegion error message is probably caused by the config file. In your config file, do you have all the rest of the filesets disabled? For example:

filebeat.modules:
  - module: aws
    cloudtrail:
      enabled: false
    cloudwatch:
      enabled: false
    ec2:
      enabled: false
    s3access:
      enabled: false
    vpcflow:
      enabled: false
    elb:
      enabled: true

dudumiquim commented 4 years ago

Hi @kaiyan-sheng. I updated to 7.9.2 and the problem no longer happens. Thank you for your attention.

kaiyan-sheng commented 4 years ago

@dudumiquim Thank you for the update! I will close this issue for now. Feel free to reach out if there's any other problem 🙂

EvanGertis commented 3 years ago

I am running into a similar issue. I am also using the AWS module. When I start Filebeat I'm hit with this string of errors:

2021-03-04T15:10:12.649-0500    ERROR   [input.s3]  s3/input.go:93  getRegionFromQueueURL failed: queueURL is not in format: https://sqs.{REGION_ENDPOINT}.amazonaws.com/{ACCOUNT_NUMBER}/{QUEUE_NAME}  {"queue_url": "<no value>"}
2021-03-04T15:10:12.649-0500    ERROR   [input.s3]  compat/compat.go:121    Input 's3' failed with: getRegionFromQueueURL failed: queueURL is not in format: https://sqs.{REGION_ENDPOINT}.amazonaws.com/{ACCOUNT_NUMBER}/{QUEUE_NAME}
2021-03-04T15:10:12.649-0500    ERROR   [input.s3]  s3/input.go:93  getRegionFromQueueURL failed: queueURL is not in format: https://sqs.{REGION_ENDPOINT}.amazonaws.com/{ACCOUNT_NUMBER}/{QUEUE_NAME}  {"queue_url": "<no value>"}
2021-03-04T15:10:12.649-0500    ERROR   [input.s3]  s3/input.go:93  getRegionFromQueueURL failed: queueURL is not in format: https://sqs.{REGION_ENDPOINT}.amazonaws.com/{ACCOUNT_NUMBER}/{QUEUE_NAME}  {"queue_url": "<no value>"}
2021-03-04T15:10:12.649-0500    ERROR   [input.s3]  compat/compat.go:121    Input 's3' failed with: getRegionFromQueueURL failed: queueURL is not in format: https://sqs.{REGION_ENDPOINT}.amazonaws.com/{ACCOUNT_NUMBER}/{QUEUE_NAME}

Then I eventually run into

Half of the set visibilityTimeout passed, visibility timeout needs to be updated

kaiyan-sheng commented 3 years ago

@EvanGertis What does your aws.yml config look like? Please feel free to open an issue in our discuss forum and we will help you there! Thank you!

EvanGertis commented 3 years ago

@kaiyan-sheng

# Module: aws
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.10/filebeat-module-aws.html

- module: aws
  cloudtrail:
    enabled: true

    # AWS SQS queue url
    var.queue_url:  ${CLOUDTRAIL_SQS}

    # Process CloudTrail logs
    # default is true, set to false to skip Cloudtrail logs
    # var.process_cloudtrail_logs: false

    # Process CloudTrail Digest logs
    # default true, set to false to skip CloudTrail Digest logs
    # var.process_digest_logs: false

    # Process CloudTrail Insight logs
    # default true, set to false to skip CloudTrail Insight logs
    # var.process_insight_logs: false

    # Filename of AWS credential file
    # If not set "$HOME/.aws/credentials" is used on Linux/Mac
    # "%UserProfile%\.aws\credentials" is used on Windows
    #var.shared_credential_file: /etc/filebeat/aws_credentials

    # Profile name for aws credential
    # If not set the default profile is used
    #var.credential_profile_name: fb-aws

    # Use access_key_id, secret_access_key and/or session_token instead of shared credential file
    #var.access_key_id: access_key_id
    #var.secret_access_key: secret_access_key
    #var.session_token: session_token

    # The duration that the received messages are hidden from ReceiveMessage request
    # Default to be 300s
    #var.visibility_timeout: 300s

    # Maximum duration before AWS API request will be interrupted
    # Default to be 120s
    #var.api_timeout: 120s

    # Custom endpoint used to access AWS APIs
    #var.endpoint: amazonaws.com

    # AWS IAM Role to assume
    #var.role_arn: arn:aws:iam::123456789012:role/test-mb

    # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
    #var.fips_enabled: false

  cloudwatch:
    enabled: false

    # AWS SQS queue url
    #var.queue_url: https://sqs.myregion.amazonaws.com/123456/myqueue

    # Filename of AWS credential file
    # If not set "$HOME/.aws/credentials" is used on Linux/Mac
    # "%UserProfile%\.aws\credentials" is used on Windows
    #var.shared_credential_file: /etc/filebeat/aws_credentials

    # Profile name for aws credential
    # If not set the default profile is used
    #var.credential_profile_name: fb-aws

    # Use access_key_id, secret_access_key and/or session_token instead of shared credential file
    #var.access_key_id: access_key_id
    #var.secret_access_key: secret_access_key
    #var.session_token: session_token

    # The duration that the received messages are hidden from ReceiveMessage request
    # Default to be 300s
    #var.visibility_timeout: 300s

    # Maximum duration before AWS API request will be interrupted
    # Default to be 120s
    #var.api_timeout: 120s

    # Custom endpoint used to access AWS APIs
    #var.endpoint: amazonaws.com

    # AWS IAM Role to assume
    #var.role_arn: arn:aws:iam::123456789012:role/test-mb

    # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
    #var.fips_enabled: false

  ec2:
    enabled: false

    # AWS SQS queue url
    #var.queue_url: https://sqs.myregion.amazonaws.com/123456/myqueue

    # Filename of AWS credential file
    # If not set "$HOME/.aws/credentials" is used on Linux/Mac
    # "%UserProfile%\.aws\credentials" is used on Windows
    #var.shared_credential_file: /etc/filebeat/aws_credentials

    # Profile name for aws credential
    # If not set the default profile is used
    #var.credential_profile_name: fb-aws

    # Use access_key_id, secret_access_key and/or session_token instead of shared credential file
    #var.access_key_id: access_key_id
    #var.secret_access_key: secret_access_key
    #var.session_token: session_token

    # The duration that the received messages are hidden from ReceiveMessage request
    # Default to be 300s
    #var.visibility_timeout: 300s

    # Maximum duration before AWS API request will be interrupted
    # Default to be 120s
    #var.api_timeout: 120s

    # Custom endpoint used to access AWS APIs
    #var.endpoint: amazonaws.com

    # AWS IAM Role to assume
    #var.role_arn: arn:aws:iam::123456789012:role/test-mb

    # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
    #var.fips_enabled: false

  elb:
    enabled: true

    # AWS SQS queue url
    var.queue_url: ${ELB_SQS}

    # Filename of AWS credential file
    # If not set "$HOME/.aws/credentials" is used on Linux/Mac
    # "%UserProfile%\.aws\credentials" is used on Windows
    #var.shared_credential_file: /etc/filebeat/aws_credentials

    # Profile name for aws credential
    # If not set the default profile is used
    #var.credential_profile_name: fb-aws

    # Use access_key_id, secret_access_key and/or session_token instead of shared credential file
    #var.access_key_id: access_key_id
    #var.secret_access_key: secret_access_key
    #var.session_token: session_token

    # The duration that the received messages are hidden from ReceiveMessage request
    # Default to be 300s
    #var.visibility_timeout: 300s

    # Maximum duration before AWS API request will be interrupted
    # Default to be 120s
    #var.api_timeout: 120s

    # Custom endpoint used to access AWS APIs
    #var.endpoint: amazonaws.com

    # AWS IAM Role to assume
    #var.role_arn: arn:aws:iam::123456789012:role/test-mb

    # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
    #var.fips_enabled: false

  s3access:
    enabled: false

    # AWS SQS queue url
    #var.queue_url: https://sqs.myregion.amazonaws.com/123456/myqueue

    # Filename of AWS credential file
    # If not set "$HOME/.aws/credentials" is used on Linux/Mac
    # "%UserProfile%\.aws\credentials" is used on Windows
    #var.shared_credential_file: /etc/filebeat/aws_credentials

    # Profile name for aws credential
    # If not set the default profile is used
    #var.credential_profile_name: fb-aws

    # Use access_key_id, secret_access_key and/or session_token instead of shared credential file
    #var.access_key_id: access_key_id
    #var.secret_access_key: secret_access_key
    #var.session_token: session_token

    # The duration that the received messages are hidden from ReceiveMessage request
    # Default to be 300s
    #var.visibility_timeout: 300s

    # Maximum duration before AWS API request will be interrupted
    # Default to be 120s
    #var.api_timeout: 120s

    # Custom endpoint used to access AWS APIs
    #var.endpoint: amazonaws.com

    # AWS IAM Role to assume
    #var.role_arn: arn:aws:iam::123456789012:role/test-mb

    # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
    #var.fips_enabled: false

  vpcflow:
    enabled: true

    # AWS SQS queue url
    var.queue_url: ${VPC_FLOW_LOGS_SQS}

    # Filename of AWS credential file
    # If not set "$HOME/.aws/credentials" is used on Linux/Mac
    # "%UserProfile%\.aws\credentials" is used on Windows
    #var.shared_credential_file: /etc/filebeat/aws_credentials

    # Profile name for aws credential
    # If not set the default profile is used
    #var.credential_profile_name: fb-aws

    # Use access_key_id, secret_access_key and/or session_token instead of shared credential file
    #var.access_key_id: access_key_id
    #var.secret_access_key: secret_access_key
    #var.session_token: session_token

    # The duration that the received messages are hidden from ReceiveMessage request
    # Default to be 300s
    #var.visibility_timeout: 300s

    # Maximum duration before AWS API request will be interrupted
    # Default to be 120s
    #var.api_timeout: 120s

    # Custom endpoint used to access AWS APIs
    #var.endpoint: amazonaws.com

    # AWS IAM Role to assume
    #var.role_arn: arn:aws:iam::123456789012:role/test-mb

    # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
    #var.fips_enabled: false

kaiyan-sheng commented 3 years ago

@EvanGertis Could you give me an example of what ${VPC_FLOW_LOGS_SQS}, ${ELB_SQS} and ${CLOUDTRAIL_SQS} look like please? I suspect this is related to https://github.com/elastic/beats/issues/24420.
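
For anyone checking their own values, here is a minimal sketch in Python that tests a queue URL against the format quoted in the error message, https://sqs.{REGION_ENDPOINT}.amazonaws.com/{ACCOUNT_NUMBER}/{QUEUE_NAME}. It is an illustration only, not Filebeat's parser: the regular expression and function names are assumptions, it only covers the default amazonaws.com endpoint, and Filebeat's real check may be stricter or looser.

```python
import re
from typing import Dict, List, Optional

# Hypothetical pattern mirroring the format printed in the error message.
QUEUE_URL_RE = re.compile(
    r"^https://sqs\.(?P<region>[^./]+)\.amazonaws\.com/"
    r"(?P<account>[^/]+)/(?P<queue>[^/]+)$"
)


def region_from_queue_url(queue_url: str) -> Optional[str]:
    """Return the region part of the URL, or None if the format does not match."""
    match = QUEUE_URL_RE.match(queue_url.strip())
    return match.group("region") if match else None


def find_bad_queue_urls(settings: Dict[str, str]) -> List[str]:
    """Return the names whose values do not look like an SQS queue URL."""
    return [
        name for name, url in settings.items()
        if region_from_queue_url(url) is None
    ]


if __name__ == "__main__":
    # Example values only; the account number and queue name are made up.
    resolved = {
        "ELB_SQS": "https://sqs.us-east-1.amazonaws.com/123456789012/elb-queue",
        "CLOUDTRAIL_SQS": "<no value>",  # what an unset variable resolved to in the log
    }
    print(find_bad_queue_urls(resolved))
```

A value such as "<no value>", as seen in the queue_url field of the log above, does not match the pattern, so no region can be derived from it.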