logstash-plugins / logstash-integration-aws

Apache License 2.0
7 stars 17 forks source link

S3-output doesn't remove temporary dir. #28

Closed mashhurs closed 2 months ago

mashhurs commented 1 year ago

Logstash information:

Please include the following information:

  1. Logstash version (e.g. bin/logstash --version): 8.6.2
  2. Logstash installation source (e.g. built from source, with a package manager: DEB/RPM, expanded from tar or zip archive, docker): downloaded from Logstash download website and expanded from tar or zip archive
  3. How is Logstash being run (e.g. as a service/service manager: systemd, upstart, etc. Via command line, docker/kubernetes): see the explanation below
  4. How was the Logstash Plugin installed: embedded integration-aws-v7.0.1

JVM (e.g. java -version):

If the affected version of Logstash is 7.9 (or earlier), or if it is NOT using the bundled JDK or using the 'no-jdk' version in 7.10 (or higher), please provide the following information:

  1. JVM version (java -version)
  2. JVM installation source (e.g. from the Operating System's package manager, from source, etc).
  3. Value of the JAVA_HOME environment variable if set.
    [2023-05-25T09:42:29,872][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"8.6.2", "jruby.version"=>"jruby 9.3.10.0 (2.6.8) 2023-02-01 107b2e6697 OpenJDK 64-Bit Server VM 17.0.6+10 on 17.0.6+10 +indy +jit [x86_64-mswin32]"}

    OS version (uname -a if on a Unix-like system): any OS but FileUtils.rm_r error happens on Windows

Description of the problem including expected versus actual behavior: 1) FileUtils.rm_r fails

    :backtrace=>[
    "org/jruby/RubyDir.java:471:in `rmdir'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1459:in `block in remove_dir1'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1473:in `platform_support'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1458:in `remove_dir1'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1451:in `remove'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:780:in `block in remove_entry'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1508:in `postorder_traverse'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:778:in `remove_entry'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:698:in `remove_entry_secure'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:628:in `block in rm_r'", "org/jruby/RubyArray.java:1865:in `each'", 
    "C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:626:in `rm_r'", 
    "C:/Program Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-7.0.1/lib/logstash/outputs/s3/temporary_file.rb:54:in `delete!'", 
    "C:/Program Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-7.0.1/lib/logstash/outputs/s3.rb:383:in `clean_temporary_file'", "org/jruby/RubyMethod.java:120:in `call'", 
    "C:/Program Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-7.0.1/lib/logstash/outputs/s3/uploader.rb:60:in `upload'", 
    "C:/Program Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-7.0.1/lib/logstash/outputs/s3/uploader.rb:30:in `block in upload_async'", 
    "C:/Program Files/logstash/vendor/bundle/jruby/2.6.0/gems/concurrent-ruby-1.1.9/lib/concurrent-ruby/concurrent/executor/java_executor_service.rb:79:in `run'"]

2) File factory issue with Prefix We introduced is_deleted? method for the file factory and it becomes true when file is physically deleted. If any error happens such like FileUtils.rm_r, factory lives forever and temporary dirs stay

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including (e.g.) pipeline definition(s), settings, locale, etc. The easier you make for us to reproduce it, the more likely that somebody will take the time to look at it.

Opened an issue in fileutils after long investigation: https://github.com/ruby/fileutils/issues/111

  1. Couldn't reproduce

Provide logs (if relevant):

[2023-05-25T09:49:29,729][ERROR][logstash.outputs.s3      ][main][c79604e7a5f024d0547b590e6a558875e58ac6d99c462b7ed96b7f39bb6b993b] An error occurred in the `on_complete` 
uploader {:exception=>Errno::ENOTEMPTY, :message=>"Directory not empty - D:/Logstash/temp/718dd509-c0f5-4da5-
b4b4-92608579e799", :path=>"D:\\Logstash\\temp/718dd509-c0f5-4da5-b4b4-
92608579e799/2023/05/25/ls.s3.b21c1c38-1dc2-443e-8491-847968549a30.2023-05-25T09.42.part0.txt.gz", 
:backtrace=>["org/jruby/RubyDir.java:471:in `rmdir'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1459:in `block in remove_dir1'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1473:in `platform_support'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1458:in `remove_dir1'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1451:in `remove'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:780:in `block in remove_entry'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:1508:in `postorder_traverse'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:778:in `remove_entry'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:698:in `remove_entry_secure'", "C:/Program 
Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:628:in `block in rm_r'", "org/jruby/RubyArray.java:1865:in `each'", 
"C:/Program Files/logstash/vendor/jruby/lib/ruby/stdlib/fileutils.rb:626:in `rm_r'", "C:/Program 
Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-7.0.1/lib/logstash/outputs/s3/temporary_file.rb:54:in 
`delete!'", "C:/Program Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-
7.0.1/lib/logstash/outputs/s3.rb:383:in `clean_temporary_file'", "org/jruby/RubyMethod.java:120:in `call'", "C:/Program 
Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-7.0.1/lib/logstash/outputs/s3/uploader.rb:60:in 
`upload'", "C:/Program Files/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-integration-aws-
7.0.1/lib/logstash/outputs/s3/uploader.rb:30:in `block in upload_async'", "C:/Program 
Files/logstash/vendor/bundle/jruby/2.6.0/gems/concurrent-ruby-1.1.9/lib/concurrent-
ruby/concurrent/executor/java_executor_service.rb:79:in `run'"]}
andsel commented 3 months ago

Hi @mashhurs need some information on this before digging to resolve it.

WDYT?

mashhurs commented 3 months ago

Hi @mashhurs need some information on this before digging to resolve it.

The root cause is same. With #34 we introduced retry-like temp folder deletion on LS startup which swipes out empty dirs but those empty dirs are caused by this #28 in a reality.

As I remember, this access fix was also included when I was investigating that traversing temp_file children process was always succeeded. In case if I am mistaken, we need to re-examine. This issue happens especially with multiple uploader threads and small file rotation window.

andsel commented 3 months ago

Thanks a lot for these information, could be useful to try a reproducer.