logstash-plugins / logstash-integration-aws

Apache License 2.0
7 stars 17 forks source link

Remove leftover empty files when closing the plugin. #46

Closed mashhurs closed 2 months ago

mashhurs commented 2 months ago

Release notes

[rn:skip]

What does this PR do?

When closing the plugin, the plugin checks if file needs to be uploaded to S3. If file is empty, it will be removed (without this change the file will be left).

Why is it important/What is the impact to the user?

For intensive S3 upload operations and rotation policy is fast (size or time), multiple files will be created. When the plugin is being closed, created but not written to files will be just left. There will be many empty files over time. Note that, we introduced separate logic to remove the empty dirs but it doesn't remove if dir has files, even empty files.

Checklist

Author's Checklist

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

PS C:\ls> .\bin\logstash -f .\config\s3.config.txt
"Using bundled JDK: C:\ls\jdk\bin\java.exe"
Sending Logstash logs to C:/ls/logs which is now configured via log4j2.properties
[2024-06-21T22:16:00,584][INFO ][logstash.runner          ] Log4j configuration path used is: C:\ls\config\log4j2.properties
[2024-06-21T22:16:00,597][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"8.15.0", "jruby.version"=>"jruby 9.4.7.0 (3.1.4) 2024-04-29 597ff08ac1 OpenJDK 64-Bit Server VM 21.0.3+9-LTS on 21.0.3+9-LTS +indy +jit [x86_64-mswin32]"}
[2024-06-21T22:16:00,597][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, -Dlogstash.jackson.stream-read-constraints.max-string-length=200000000, -Dlogstash.jackson.stream-read-constraints.max-number-length=10000, -Djruby.regexp.interruptible=true, -Djdk.io.File.enableADS=true, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED, -Dio.netty.allocator.maxOrder=11]
[2024-06-21T22:16:00,597][INFO ][logstash.runner          ] Jackson default value override `logstash.jackson.stream-read-constraints.max-string-length` configured to `200000000`
[2024-06-21T22:16:00,597][INFO ][logstash.runner          ] Jackson default value override `logstash.jackson.stream-read-constraints.max-number-length` configured to `10000`
[2024-06-21T22:16:00,678][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2024-06-21T22:16:02,119][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2024-06-21T22:16:02,913][INFO ][org.reflections.Reflections] Reflections took 175 ms to scan 1 urls, producing 138 keys and 481 values
[2024-06-21T22:16:03,062][INFO ][logstash.codecs.json     ] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[2024-06-21T22:16:06,613][INFO ][logstash.codecs.jsonlines] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[2024-06-21T22:16:08,726][INFO ][logstash.javapipeline    ] Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[2024-06-21T22:16:09,025][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["C:/ls/config/s3.config.txt"], :thread=>"#<Thread:0x4f976cae C:/ls/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2024-06-21T22:16:10,164][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>1.14}
[2024-06-21T22:16:10,171][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2024-06-21T22:16:10,190][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2024-06-21T22:16:15,688][INFO ][logstash.javapipeline    ][main] Pipeline terminated {"pipeline.id"=>"main"}
[2024-06-21T22:16:15,848][INFO ][logstash.pipelinesregistry] Removed pipeline from registry successfully {:pipeline_id=>:main}
[2024-06-21T22:16:15,860][INFO ][logstash.runner          ] Logstash shut down.
PS C:\ls>
mashhurs commented 2 months ago

Failed CI will be 🟢 with 8.14.2 soon. https://github.com/elastic/logstash/pull/16255 unit test is a safeguard for such failure cases.

mashhurs commented 2 months ago

LGTM, just minor suggestions:

  • How about adding a debug log here, just in case we need troubleshooting those temporary file deletions.

I once more thought and moved the removal logic in the plugin#close. In clean_temporary_file there are debugging log lines and LS core has also debugging log before closing the plugin as I remember.

  • Could we add a test to ensure those empty files are deleted when the close is called?

Nice, I have just added it.