logstash-plugins / logstash-output-s3

Logstash sometimes ignores the prefix and creates UUID folders #240

Open tomerifrog opened 3 years ago

tomerifrog commented 3 years ago

Logstash information:

  1. Logstash version: 7.13.2
  2. Logstash installation source: docker
  3. How is Logstash being run: Kubernetes

Running ~10 pods

Description of the problem including expected versus actual behavior: Logstash sometimes creates randomly named (UUID) subfolders at the root level of the bucket, ignoring the configured prefix. Expected: every object key starts with the date-based prefix.

Configuration:

  s3 {
    access_key_id => "xxxx"
    secret_access_key => "xxxx"
    region => "xxxx"
    bucket => "some_bucket"
    size_file => "104857600"
    time_file => "360"
    prefix => "%{+YYYY}/%{+MM}/%{+dd}"
    codec => "json_lines"
    encoding => "gzip"
    temporary_directory => "/usr/share/logstash/data/s3"
    validate_credentials_on_root_bucket => false
    additional_settings => {
      "force_path_style" => true
    }
  }
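To make the expected behavior concrete, here is a minimal sketch of the key prefix this configuration should produce for a given event time. It assumes the `%{+YYYY}/%{+MM}/%{+dd}` pattern is rendered from the event timestamp in UTC; `expected_prefix` is a hypothetical helper for illustration, not part of Logstash or the plugin.

```python
from datetime import datetime, timezone


def expected_prefix(event_time):
    """Return the year/month/day prefix expected for an event timestamp.

    Assumption: the date pattern in the `prefix` setting is evaluated
    against the event time in UTC.
    """
    return event_time.astimezone(timezone.utc).strftime("%Y/%m/%d")


# Event time taken from one of the file names in the bucket listing.
sample = datetime(2021, 8, 18, 18, 58, tzinfo=timezone.utc)
print(expected_prefix(sample))
```

Any object whose key does not start with a prefix of this shape is one of the unexpected uploads described here.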

The bucket folders tree looks something like this:

1c784b4b-ec31-45f3-8e33-23re1de4f813/
   2021/
      06/
         28/
         29/
            ls.s3.45c80fab-5a34-41c8-999f-5ccc6fg6b8bb.2021-06-29T15.13.part0.txt.gz
            ..
            ..
         30/
2021/
   08/
       18/
          ls.s3.014280e1-266d-459b-b1d5-d3b56g23d590.2021-08-18T18.58.part01.txt.gz
          ls.s3.014280e1-266d-459b-b1d5-d3b56g23d590.2021-08-18T18.58.part02.txt.gz
          ls.s3.014280e1-266d-459b-b1d5-d3b56g23d590.2021-08-18T18.58.part03.txt.gz
          ...
          ...
05acb62e-c76e-4b56-9e5r-ea7caa69749a/
971d92da-2f3f-48fe-853a-c33e2cf680c4/
d491d025-2c49-4fb4-basd-72bacf89595c/

Only the folder tree that starts with 2021 is expected. It contains almost all of the files, and the output usually works fine. The other folders are created for no apparent reason and sometimes contain just a few real files with data.

I suspect that it happens only when Logstash is restarted.

Steps to reproduce:

  1. Set up the configuration as shown above
  2. Start multiple instances of Logstash
  3. Delete pods or otherwise force them to restart
  4. Check the top-level folders in the bucket
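The check in step 4 can be sketched roughly as follows. This is only an illustration: the folder names are hardcoded from the listing above rather than fetched from S3, `split_top_level` is a hypothetical helper, and the pattern is deliberately loose (any lowercase letter or digit in an 8-4-4-4-12 layout) because the names in this report are not all strict hexadecimal.

```python
import re

# Loose UUID-shaped pattern: 8-4-4-4-12 groups of lowercase letters or digits,
# with an optional trailing slash. Intentionally not restricted to hex digits.
UUID_LIKE = re.compile(r"^[0-9a-z]{8}(-[0-9a-z]{4}){3}-[0-9a-z]{12}/?$")


def split_top_level(prefixes):
    """Split top-level folder names into (stray UUID-like, expected) lists."""
    stray, expected = [], []
    for name in prefixes:
        if UUID_LIKE.match(name):
            stray.append(name)
        else:
            expected.append(name)
    return stray, expected


# Top-level folders copied from the bucket listing in this report.
top_level = [
    "1c784b4b-ec31-45f3-8e33-23re1de4f813/",
    "2021/",
    "05acb62e-c76e-4b56-9e5r-ea7caa69749a/",
    "971d92da-2f3f-48fe-853a-c33e2cf680c4/",
    "d491d025-2c49-4fb4-basd-72bacf89595c/",
]

stray, expected = split_top_level(top_level)
print("unexpected folders:", stray)
print("expected folders:", expected)
```

In a real check, the list of top-level folders would come from a delimiter-based listing of the bucket root instead of a hardcoded list.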

Any ideas what could cause this?