logstash-plugins / logstash-output-s3


Fatal errors: Errno::ENOENT: No such file or directory #187

Closed chrisghill closed 1 year ago

chrisghill commented 6 years ago

I've been using a dockerized logstash container forwarding to ES and S3 for a few weeks now. I just recently started having fatal errors with the following output:

[FATAL] 2018-07-06 18:16:47.643 [LogStash::Runner] runner - An unexpected error occurred! {:error=>#<Errno::ENOENT: No such file or directory - No such file or directory - /tmp/logstash/8b2eabbb-bbbf-4bc6-b9e8-1f3fcda72b8b/dev-ops.nginx.access/2018/wk27/ls.s3.ce52b777-db28-4de9-a020-4a3f0d771930.2018-07-06T18.01.part0.txt>,
 :backtrace=>["org/jruby/RubyFileTest.java:240:in 'size'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3/temporary_file.rb:40:in 'size'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3.rb:343:in 'upload_file'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3.rb:331:in 'block in rotate_if_needed'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3/file_repository.rb:83:in 'block in get_factory'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3/file_repository.rb:26:in 'block in with_lock'",
"org/jruby/ext/thread/Mutex.java:148:in 'synchronize'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3/file_repository.rb:25:in 'with_lock'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3/file_repository.rb:83:in 'get_factory'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3.rb:322:in 'block in rotate_if_needed'",
"org/jruby/RubyHash.java:1405:in 'each_key'",
"uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/set.rb:306:in 'each'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3.rb:318:in 'rotate_if_needed'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.1.1/lib/logstash/outputs/s3.rb:248:in 'multi_receive_encoded'",
"/usr/share/logstash/logstash-core/lib/logstash/outputs/base.rb:90:in 'multi_receive'",
"/usr/share/logstash/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:13:in 'multi_receive'",
"/usr/share/logstash/logstash-core/lib/logstash/output_delegator.rb:49:in 'multi_receive'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:477:in 'block in output_batch'",
"org/jruby/RubyHash.java:1343:in 'each'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:476:in 'output_batch'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:428:in 'worker_loop'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:386:in 'block in start_workers'"]}
[ERROR] 2018-07-06 18:16:47.674 [LogStash::Runner] Logstash - java.lang.IllegalStateException: org.jruby.exceptions.RaiseException: (SystemExit) exit

I'm wondering what could cause this to start happening all of a sudden. It had been running fine until yesterday. The only change I made yesterday was splitting a pipeline into two separate pipelines with separate configurations. I'll also add that Logstash will run fine for anywhere from 1 to 20 minutes, processing thousands of logs, and then crash with the above error.

More information about the configuration:

Container: docker.elastic.co/logstash/logstash:6.2.4

Output config:

output {
  if [@metadata][index_prefix] {
    s3 {
      bucket => "logstash-output"  
      prefix => "/%{[@metadata][index_prefix]}/%{+YYYY}/wk%{+w}/"
      access_key_id => "XXXXX"
      secret_access_key => "XXXXX"
      region => "us-west-2"  
    }
  }
}

Docker compose of logstash container:

version: '3'
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:6.2.4
    restart: always
    ports:
      - "5044-5045:5044-5045"
    volumes:
      - /etc/logstash/config:/usr/share/logstash/config
      - /etc/logstash/pipeline:/usr/share/logstash/pipeline
chrisghill commented 6 years ago

This issue seems to have cleared up over the weekend. I made no changes.

orbiran88 commented 6 years ago

I'm suffering from this issue too! It can happen 10-20 times and then clear up... Every time I make a change to my pipelines I get this behavior...

Need some help with this! Thanks..

logstash-output-s3-4.1.4 logstash 6.2.4-1

orbiran88 commented 6 years ago

Solved it by configuring a different "temporary_directory" for each s3 output :)
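
For example, a minimal sketch of what I mean (the bucket names and directory paths are placeholders, not my real config):

output {
  s3 {
    bucket => "example-bucket-a"
    temporary_directory => "/tmp/logstash/s3-a"
  }
  s3 {
    bucket => "example-bucket-b"
    temporary_directory => "/tmp/logstash/s3-b"
  }
}

The key is that no two s3 outputs, whether in the same pipeline or in different ones, share a temporary_directory, so their temporary files can't collide.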

chrisghill commented 6 years ago

I just made more pipeline changes and this issue came back. @orbiran88 Can you clarify what you mean by configuring a different "temporary_directory" for each s3 output? You can see my s3 output config in the original comment. I'm sending data into different folders based on the index.

chrisghill commented 6 years ago

I seem to have found a workaround. If I temporarily disable all filebeat instances sending to logstash, then restart the logstash instance and allow it time to fully initialize before turning filebeat back on, I don't run into this issue.
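
In practice the sequence looks something like this (the commands are illustrative; this assumes systemd-managed Filebeat on the shipper hosts and docker-compose on the Logstash host, which may not match your setup):

# on each shipper host
sudo systemctl stop filebeat

# on the logstash host
docker-compose restart logstash
# wait until the pipelines report they have started

# then, on each shipper host
sudo systemctl start filebeat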

So basically I have to temporarily disable all ingest to make any changes to Logstash without the s3 output causing fatal errors. This really isn't a feasible workaround at any sort of scale.

erickpeirson commented 4 years ago

Solved it by configuring a different "temporary_directory" for each s3 output :)

Unfortunately, this does not seem to be working either. Running Logstash 7.4.0. Here's a typical config from what I have tried so far:

        s3 {
            aws_credentials_file => "${AWS_CREDENTIALS_FILE}"
            region => "us-east-1"
            bucket => "foo-bucket"
            encoding => "gzip"
            prefix => "foo-logs/%{[host][name]}/%{+YYYY}/%{+MM}"
            time_file => 60
            temporary_directory => "/tmp/logstash/foo-logs"
        }

Here's an example of the exception:

[FATAL] 2019-10-09 13:20:02.233 [LogStash::Runner] runner - An unexpected error occurred! {
    :error=>java.lang.IllegalStateException: org.jruby.exceptions.SystemCallError: (ENOENT) No such file or directory - No such file or directory - /tmp/logstash/foo-logs/86836fee-afb6-4555-9724-2424f2b49c08/foo-logs/bil-foo-001.serverfarm.baz.com/2019/03/ls.s3.164928fc-ea75-4166-9a7e-cc54fb960229.2019-10-09T13.17.part0.txt.gz, 
    :backtrace=>[
        "org.logstash.execution.WorkerLoop.run(org/logstash/execution/WorkerLoop.java:85)",
        "jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)",
        "jdk.internal.reflect.NativeMethodAccessorImpl.invoke(jdk/internal/reflect/NativeMethodAccessorImpl.java:62)",
        "jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(jdk/internal/reflect/DelegatingMethodAccessorImpl.java:43)",
        "java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:566)",
        "org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:440)",
        "org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:304)",
        "RUBY.start_workers(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:243)",
        "org.jruby.RubyProc.call(org/jruby/RubyProc.java:295)",
        "org.jruby.RubyProc.call(org/jruby/RubyProc.java:274)",
        "org.jruby.RubyProc.call(org/jruby/RubyProc.java:270)",
        "java.lang.Thread.run(java/lang/Thread.java:834)"
    ]}
erickpeirson commented 4 years ago

This looks remarkably similar to https://github.com/logstash-plugins/logstash-output-s3/issues/107

erickpeirson commented 4 years ago

Narrowed it down a little bit.

Context: I'm running Logstash as a StatefulSet in Kubernetes, with one persistent volume per instance for queue, dlq, etc. /tmp is pod-local (i.e. not on the persistent volume). Logstash pods come and go fairly regularly, and shut down as gracefully as possible.

Simply using the persistent volume instead of /tmp for temporary S3 files cleared this up entirely. Seems like the S3 plugin is not shutting down so gracefully, and a reference to a file that won't exist upon restart is hanging around.

Since the S3 plugin uses /tmp by default, I would have expected it to be resilient to restarts where /tmp is not necessarily persistent.
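
For anyone else on Kubernetes, a minimal sketch of the change (the volume name and mount path are illustrative, not from my actual manifests). In the StatefulSet pod spec, mount the per-pod persistent volume:

containers:
  - name: logstash
    volumeMounts:
      - name: logstash-data        # the per-instance persistent volume
        mountPath: /usr/share/logstash/data

Then point the s3 output at a directory on that volume instead of the default under /tmp:

s3 {
  bucket => "foo-bucket"
  temporary_directory => "/usr/share/logstash/data/s3-temp"
}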

krisnaru commented 4 years ago

I have moved the /tmp directory onto the StatefulSet's persistent volume, and the problem still persists:

[2020-07-08T23:59:06,045][FATAL][logstash.runner ] An unexpected error occurred! {:error=>#<Errno::ENOENT: No such file or directory - No such file or directory - /logstorage/temp/0440b0c3-74a2-41fa-ae08-2bf9740735e9/json/events/2020/07/08/ls.s3.7ca4d682-0bb3-4a1c-bc1e-8451428ef31a.2020-07-08T23.57.part0.txt>,
 :backtrace=>["org/jruby/RubyFileTest.java:240:in 'size'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3/temporary_file.rb:40:in 'size'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3.rb:341:in 'upload_file'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3.rb:329:in 'block in rotate_if_needed'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3/file_repository.rb:83:in 'block in get_factory'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3/file_repository.rb:26:in 'block in with_lock'",
"org/jruby/ext/thread/Mutex.java:148:in 'synchronize'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3/file_repository.rb:25:in 'with_lock'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3/file_repository.rb:83:in 'get_factory'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3.rb:320:in 'block in rotate_if_needed'",
"org/jruby/RubyHash.java:1405:in 'each_key'",
"uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/set.rb:306:in 'each'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3.rb:316:in 'rotate_if_needed'",
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-output-s3-4.0.13/lib/logstash/outputs/s3.rb:246:in 'multi_receive_encoded'",
"/usr/share/logstash/logstash-core/lib/logstash/outputs/base.rb:90:in 'multi_receive'",
"/usr/share/logstash/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:13:in 'multi_receive'",
"/usr/share/logstash/logstash-core/lib/logstash/output_delegator.rb:49:in 'multi_receive'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:479:in 'block in output_batch'",
"org/jruby/RubyHash.java:1343:in 'each'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:478:in 'output_batch'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:430:in 'worker_loop'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:385:in 'block in start_workers'"]}

yaauie commented 1 year ago

Both version 4.4.1 of the stand-alone logstash-output-s3 plugin and version 7.0.1 of logstash-integration-aws have been published, including multiple closely-related concurrency fixes, one of which addressed symptoms identical to those described in this issue.

If you continue to experience these symptoms after upgrading the plugin, please feel free to re-open with further details.
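
For reference, a typical in-place upgrade (run from the Logstash home directory, then restart Logstash):

bin/logstash-plugin update logstash-output-s3

or, if you have migrated to the integration plugin:

bin/logstash-plugin update logstash-integration-aws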