logstash-plugins / logstash-input-s3

Apache License 2.0

Files left unprocessed because of the cutoff time calculation #227

Closed yongkyun closed 3 years ago

yongkyun commented 3 years ago

The cutoff time calculation uses Time.now. When the S3 bucket contains many files, the value of Time.now can roll over to the next second partway through the list_new_files loop. For example:

previous loop : Time.now => 2021-04-06T20:13:59.996
current loop : Time.now => 2021-04-06T20:14:00.001

If two files have the same modified second, the first file is skipped and left for the next cycle. The other file is processed and sincedb is updated, because the second of Time.now has changed by then. In the next cycle the first file is still not processed, because its modified time is the same as the time in sincedb.
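The race described above can be reproduced in isolation with a small Ruby sketch. It is not the plugin's code: `S3Object`, `CUTOFF_SECONDS`, `fake_clock`, and both selection methods are hypothetical names made up for illustration, and the 3-second cutoff is an assumed value. The sketch compares reading the clock once per object with reading it once per listing pass.

```ruby
# Hypothetical stand-in for an S3 object summary (not the AWS SDK class).
S3Object = Struct.new(:key, :last_modified)

CUTOFF_SECONDS = 3 # assumed value, for illustration only

# Racy variant: reads the clock once per object, so the cutoff can move
# forward while the loop is still running.
def select_per_iteration(objects, clock)
  objects.select { |obj| obj.last_modified <= clock.call - CUTOFF_SECONDS }
end

# Snapshot variant: reads the clock once per listing pass, so every object
# is compared against the same cutoff.
def select_with_snapshot(objects, clock)
  cutoff = clock.call - CUTOFF_SECONDS
  objects.select { |obj| obj.last_modified <= cutoff }
end

# Fake clock that hands out the given readings in order and then keeps
# returning the last one.
def fake_clock(readings)
  queue = readings.dup
  -> { queue.length > 1 ? queue.shift : queue.first }
end

# Two objects that share the same modified second.
modified = Time.utc(2021, 4, 6, 20, 13, 57)
objects = [S3Object.new("a.gz", modified), S3Object.new("b.gz", modified)]

# Clock readings that straddle a second boundary, as in the report.
readings = [
  Time.utc(2021, 4, 6, 20, 13, 59, 996_000), # 20:13:59.996
  Time.utc(2021, 4, 6, 20, 14, 0, 1_000)     # 20:14:00.001
]

racy = select_per_iteration(objects, fake_clock(readings)).map(&:key)
snap = select_with_snapshot(objects, fake_clock(readings)).map(&:key)

p racy # only "b.gz" is selected; "a.gz" is skipped
p snap # neither is selected; both wait for the next pass together
```

With a single snapshot, files that share a modified second are either all selected or all deferred, so sincedb cannot advance past one of them while the other is still unprocessed.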

Related Logstash log:

[2021-04-07T15:25:58,998][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Found key {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-29f9c269-74a2-4e39-af88-70175e9fab64.gz"}
[2021-04-07T15:25:58,999][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Object Modified After Cutoff Time {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-29f9c269-74a2-4e39-af88-70175e9fab64.gz"}
[2021-04-07T15:25:58,999][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Found key {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-327f9826-ab6e-4f2b-b181-78653ea1c81c.gz"}
[2021-04-07T15:25:58,999][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Object Modified After Cutoff Time {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-327f9826-ab6e-4f2b-b181-78653ea1c81c.gz"}
[2021-04-07T15:25:58,999][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Found key {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-7e175120-037f-4068-9552-80ef13f4ae88.gz"}
[2021-04-07T15:25:59,000][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Added to objects[] {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-7e175120-037f-4068-9552-80ef13f4ae88.gz", :length=>112}

.......

[2021-04-07T15:26:13,262][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Processing {:bucket=>"s3-mcpc-prd-waf-log", :key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-7e175120-037f-4068-9552-80ef13f4ae88.gz"}
[2021-04-07T15:26:13,262][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Downloading remote file {:remote_key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-7e175120-037f-4068-9552-80ef13f4ae88.gz", :local_filename=>"/tmp/logstash/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-7e175120-037f-4068-9552-80ef13f4ae88.gz"}
[2021-04-07T15:26:13,284][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Processing file {:filename=>"/tmp/logstash/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-7e175120-037f-4068-9552-80ef13f4ae88.gz"}

.....

[2021-04-07T15:26:58,852][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Found key {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-29f9c269-74a2-4e39-af88-70175e9fab64.gz"}
[2021-04-07T15:26:58,853][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Object Not Modified {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-29f9c269-74a2-4e39-af88-70175e9fab64.gz"}
[2021-04-07T15:26:58,853][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Found key {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-327f9826-ab6e-4f2b-b181-78653ea1c81c.gz"}
[2021-04-07T15:26:58,853][DEBUG][logstash.inputs.s3 ][awswaf][52e504950fc0cf36629d17643a76c2fbd8bc038a15d12c8f56c145d9b56a9fcc] Object Not Modified {:key=>"mc/2021/04/07/06/aws-waf-logs-mcpc-prd-mc-cf-waf-kinesis-1-2021-04-07-06-25-50-327f9826-ab6e-4f2b-b181-78653ea1c81c.gz"}