fluent / fluent-plugin-s3

Amazon S3 input and output plugin for Fluentd
https://docs.fluentd.org/output/s3

How to add event time to path/s3_object_key_format #325

Closed: sayan-rudder closed this issue 3 years ago

sayan-rudder commented 4 years ago

Currently I am injecting the event time (log_time) into each record, and the output format is JSON.

<match host.**>
        @type s3
        @id host_out_s3
        @log_level info
        aws_key_id "#{ENV['AWS_KEY_ID']}"
        aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
        s3_bucket "#{ENV['S3_BUCKET_NAME']}"
        s3_region "#{ENV['S3_BUCKET_REGION']}"
        path "host"
        time_slice_format "%Y-%m-%d"
        store_as "gzip"
        s3_object_key_format "%{path}/${tag}/${$.host}/%{time_slice}/host_messages_%{index}.%{file_extension}"
        <format>
          @type json
        </format>
        <inject>
          time_key log_time
          time_type string
          time_format %Y-%m-%dT%H:%M:%S.%NZ
          utc true
        </inject>
        <buffer tag, time,$.host>
          timekey 300
          timekey_use_utc true
          chunk_limit_size 100m
        </buffer>
   </match>

Can I add the event/record time, say at hourly granularity, to the path/s3_object_key_format? That is, I want to be able to tell which time period the logs in a file cover just by looking at the key.

How would I then add that to the buffer section?

Any help or pointers would be appreciated.

repeatedly commented 4 years ago

You can use placeholders. See s3_object_key_format in the README: https://github.com/fluent/fluent-plugin-s3/blob/master/README.md
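
For hourly keys specifically, one way to do it (a minimal sketch based on the README's placeholder description, not verified against this exact setup) is to include the hour in time_slice_format and align the buffer timekey to one hour, so that %{time_slice} in the object key expands to the hour covered by each chunk:

<match host.**>
  @type s3
  # ... credentials, bucket and region as in the config above ...
  path "host"
  # include the hour in the time slice so it appears in the object key
  time_slice_format "%Y-%m-%d/%H"
  s3_object_key_format "%{path}/${tag}/${$.host}/%{time_slice}/host_messages_%{index}.%{file_extension}"
  <buffer tag,time,$.host>
    # one chunk per hour, so %{time_slice} maps to a single hour
    timekey 3600
    timekey_use_utc true
    chunk_limit_size 100m
  </buffer>
</match>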

sayan-rudder commented 4 years ago

Hi, thanks for getting back.

<match host.**>
        @type s3
        @id host_out_s3
        @log_level info
        aws_key_id "#{ENV['AWS_KEY_ID']}"
        aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
        s3_bucket "#{ENV['S3_BUCKET_NAME']}"
        s3_region "#{ENV['S3_BUCKET_REGION']}"
        path "host"
        time_slice_format "%Y-%m-%d"
        store_as "gzip"
        s3_object_key_format "%{path}/${tag}/${$.host}/%{time_slice}/host_messages_%{index}.%{file_extension}"
        <format>
          @type json
        </format>
        <inject>
          time_key log_time
          time_type string
          time_format %Y-%m-%dT%H:%M:%S.%NZ
          utc true
        </inject>
        <buffer tag, time,$.host>
          timekey 300
          timekey_use_utc true
          chunk_limit_size 100m
        </buffer>
   </match>

I want to use the record's log_time (which I inject, configuration as above) in s3_object_key_format. But since it is a full timestamp, would it make sense to use log_time as one of the buffer chunk keys?

Ideally my aim is to have a prefix of the UTC hour, like 1 (meaning UTC 1am to 2am logs), 2 (meaning UTC 2am to 3am logs), and so on.

I am not sure if I'm thinking about this correctly. Any help would be useful, @repeatedly.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has been open for 90 days with no activity. Remove the stale label or comment, or this issue will be closed in 30 days.

kenhys commented 3 years ago

Maybe https://docs.fluentd.org/filter/record_transformer will help: you can modify the record in a filter, then refer to the added field from s3_object_key_format.
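
To sketch that idea (a rough, untested example; the field name log_hour and the Ruby expression are assumptions, not from this thread): add an hour field with record_transformer, declare it as a buffer chunk key, and then reference it from s3_object_key_format:

<filter host.**>
  @type record_transformer
  enable_ruby true
  <record>
    # hour bucket derived from the event time (assumed expression; adjust as needed)
    log_hour ${Time.at(time).utc.strftime('%H')}
  </record>
</filter>

<match host.**>
  @type s3
  # ... same S3 settings as above ...
  s3_object_key_format "%{path}/${tag}/${$.host}/%{time_slice}/${log_hour}/host_messages_%{index}.%{file_extension}"
  <buffer tag,time,log_hour,$.host>
    timekey 300
    timekey_use_utc true
    chunk_limit_size 100m
  </buffer>
</match>

Note that any record field referenced as ${...} in s3_object_key_format has to be listed as a chunk key in the buffer section, otherwise the placeholder cannot be resolved.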