elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash

DLQ fails to start throwing IllegalArgumentException: newPosition < 0: (-1 < 0) #14918

Open ash-darin opened 1 year ago

ash-darin commented 1 year ago
  1. Logstash version: 7.17.7
  2. Logstash installation source: docker.elastic.co/logstash/logstash
  3. How is Logstash being run: docker

Plugins installed

Using bundled JDK: /usr/share/logstash/jdk
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
logstash-codec-avro (3.3.1)
logstash-codec-cef (6.2.5)
logstash-codec-collectd (3.1.0)
logstash-codec-dots (3.0.6)
logstash-codec-edn (3.1.0)
logstash-codec-edn_lines (3.1.0)
logstash-codec-es_bulk (3.1.0)
logstash-codec-fluent (3.4.1)
logstash-codec-graphite (3.0.6)
logstash-codec-json (3.1.1)
logstash-codec-json_lines (3.1.0)
logstash-codec-line (3.1.1)
logstash-codec-msgpack (3.1.0)
logstash-codec-multiline (3.1.1)
logstash-codec-netflow (4.2.2)
logstash-codec-plain (3.1.0)
logstash-codec-rubydebug (3.1.0)
logstash-filter-aggregate (2.10.0)
logstash-filter-anonymize (3.0.6)
logstash-filter-cidr (3.1.3)
logstash-filter-clone (4.2.0)
logstash-filter-csv (3.1.1)
logstash-filter-date (3.1.15)
logstash-filter-de_dot (1.0.4)
logstash-filter-dissect (1.2.5)
logstash-filter-dns (3.1.5)
logstash-filter-drop (3.0.5)
logstash-filter-elasticsearch (3.11.1)
logstash-filter-fingerprint (3.3.2)
logstash-filter-geoip (7.2.12)
logstash-filter-grok (4.4.2)
logstash-filter-http (1.2.1)
logstash-filter-json (3.2.0)
logstash-filter-kv (4.5.0)
logstash-filter-memcached (1.1.0)
logstash-filter-metrics (4.0.7)
logstash-filter-mutate (3.5.6)
logstash-filter-prune (3.0.4)
logstash-filter-ruby (3.1.8)
logstash-filter-sleep (3.0.7)
logstash-filter-split (3.1.8)
logstash-filter-syslog_pri (3.1.1)
logstash-filter-throttle (4.0.4)
logstash-filter-translate (3.3.1)
logstash-filter-truncate (1.0.5)
logstash-filter-urldecode (3.0.6)
logstash-filter-useragent (3.3.3)
logstash-filter-uuid (3.0.5)
logstash-filter-xml (4.1.3)
logstash-input-azure_event_hubs (1.4.4)
logstash-input-beats (6.2.6)
└── logstash-input-elastic_agent (alias)
logstash-input-couchdb_changes (3.1.6)
logstash-input-dead_letter_queue (1.1.12)
logstash-input-elasticsearch (4.12.3)
logstash-input-exec (3.4.0)
logstash-input-file (4.4.4)
logstash-input-ganglia (3.1.4)
logstash-input-gelf (3.3.2)
logstash-input-generator (3.1.0)
logstash-input-graphite (3.0.6)
logstash-input-heartbeat (3.1.1)
logstash-input-http (3.4.5)
logstash-input-http_poller (5.1.0)
logstash-input-imap (3.2.0)
logstash-input-jms (3.2.2)
logstash-input-pipe (3.1.0)
logstash-input-redis (3.7.0)
logstash-input-s3 (3.8.3)
logstash-input-snmp (1.3.1)
logstash-input-snmptrap (3.1.0)
logstash-input-sqs (3.1.3)
logstash-input-stdin (3.4.0)
logstash-input-syslog (3.6.0)
logstash-input-tcp (6.2.7)
logstash-input-twitter (4.1.0)
logstash-input-udp (3.5.0)
logstash-input-unix (3.1.2)
logstash-integration-elastic_enterprise_search (2.1.2)
├── logstash-output-elastic_app_search
└── logstash-output-elastic_workplace_search
logstash-integration-jdbc (5.2.6)
├── logstash-input-jdbc
├── logstash-filter-jdbc_streaming
└── logstash-filter-jdbc_static
logstash-integration-kafka (10.9.0)
├── logstash-input-kafka
└── logstash-output-kafka
logstash-integration-rabbitmq (7.3.1)
├── logstash-input-rabbitmq
└── logstash-output-rabbitmq
logstash-output-cloudwatch (3.0.10)
logstash-output-csv (3.0.8)
logstash-output-elasticsearch (11.4.1)
logstash-output-email (4.1.1)
logstash-output-file (4.3.0)
logstash-output-graphite (3.1.6)
logstash-output-http (5.2.5)
logstash-output-lumberjack (3.1.9)
logstash-output-nagios (3.0.6)
logstash-output-null (3.0.5)
logstash-output-pipe (3.0.6)
logstash-output-redis (5.0.0)
logstash-output-s3 (4.3.7)
logstash-output-sns (4.0.8)
logstash-output-sqs (6.0.0)
logstash-output-stdout (3.1.4)
logstash-output-tcp (6.0.2)
logstash-output-udp (3.2.0)
logstash-output-webhdfs (3.0.6)
logstash-patterns-core (4.3.4)

JVM

provided by container "Using bundled JDK: /usr/share/logstash/jdk" "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.16+8 on 11.0.16+8 +indy +jit [linux-x86_64]"

OS version

Linux testmachine 4.19.0-23-amd64 #1 SMP Debian 4.19.269-1 (2022-12-20) x86_64 GNU/Linux

Description of the problem including expected versus actual behavior

Logstash is unable to start, throwing java.lang.IllegalArgumentException: newPosition < 0: (-1 < 0)

Logstash fails at startup with the following stack trace. Please note: either this is not the same issue as #14605, or the backport #14752 did not address it correctly. It fails with a different stack trace and still occurs in 7.17.7.

Steps to reproduce

Take the files from #14605 (attached) with this config:

dlq_sincedb_log.zip

input {
    dead_letter_queue {
        path => "/usr/share/logstash/data/dead_letter_queue/"
        start_timestamp => "2020-11-25T13:50:00"
        pipeline_id => "test_pipeline"
    }
}
output {
    stdout {}
}

Copy the sincedb file (with a preceding dot) and the 1-byte files to /usr/share/logstash/data/dead_letter_queue/ (from the viewpoint of the container).

Provide logs (if relevant)

[2023-02-27T16:26:43,659][ERROR][logstash.javapipeline ][dead_letter_queue_valve] Pipeline error {:pipeline_id=>"dead_letter_queue_valve", :exception=>java.lang.IllegalArgumentException: newPosition < 0: (-1 < 0), :backtrace=>["java.nio.Buffer.createPositionException(java/nio/Buffer.java:318)", "java.nio.Buffer.position(java/nio/Buffer.java:293)", "java.nio.ByteBuffer.position(java/nio/ByteBuffer.java:1094)", "org.logstash.common.io.RecordIOReader.consumeBlock(org/logstash/common/io/RecordIOReader.java:184)", "org.logstash.common.io.RecordIOReader.consumeToStartOfEvent(org/logstash/common/io/RecordIOReader.java:238)", "org.logstash.common.io.RecordIOReader.readEvent(org/logstash/common/io/RecordIOReader.java:282)", "org.logstash.common.io.RecordIOReader.seekToNextEventPosition(org/logstash/common/io/RecordIOReader.java:156)", "org.logstash.common.io.DeadLetterQueueReader.seekToNextEvent(org/logstash/common/io/DeadLetterQueueReader.java:87)", "org.logstash.input.DeadLetterQueueInputPlugin.setInitialReaderState(org/logstash/input/DeadLetterQueueInputPlugin.java:95)", "org.logstash.input.DeadLetterQueueInputPlugin.lazyInitQueueReader(org/logstash/input/DeadLetterQueueInputPlugin.java:74)", "org.logstash.input.DeadLetterQueueInputPlugin.register(org/logstash/input/DeadLetterQueueInputPlugin.java:81)", "jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)", "jdk.internal.reflect.NativeMethodAccessorImpl.invoke(jdk/internal/reflect/NativeMethodAccessorImpl.java:62)", "jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(jdk/internal/reflect/DelegatingMethodAccessorImpl.java:43)", "java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:566)", "org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:441)", "org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:305)", "usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_input_minus_dead_letter_queue_minus_1_dot_1_dot_12.lib.logstash.inputs.dead_letter_queue.register(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-dead_letter_queue-1.1.12/lib/logstash/inputs/dead_letter_queue.rb:57)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.register_plugins(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:233)", "org.jruby.RubyArray.each(org/jruby/RubyArray.java:1821)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.register_plugins(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:232)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start_inputs(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:391)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start_workers(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:316)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.run(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:190)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:142)", "org.jruby.RubyProc.call(org/jruby/RubyProc.java:318)", "java.lang.Thread.run(java/lang/Thread.java:829)"], "pipeline.sources"=>["/usr/share/logstash/pipelines/dead_letter_queue_valve/active/100-input-dlq.config", "/usr/share/logstash/pipelines/dead_letter_queue_valve/active/101-filter-drop-events-older-than-35d.config", 
"/usr/share/logstash/pipelines/dead_letter_queue_valve/active/102-test-case.config", "/usr/share/logstash/pipelines/dead_letter_queue_valve/active/200-filter-extract-fields.config", "/usr/share/logstash/pipelines/dead_letter_queue_valve/active/201-filter-add-dlq-event-hash-for-jdbc-common-out.config", "/usr/share/logstash/pipelines/dead_letter_queue_valve/active/300-output-flatfile.config", "/usr/share/logstash/pipelines/dead_letter_queue_valve/active/301-output-elastic-siem.config"], :thread=>"#<Thread:0x249baad4 run>"}

yaauie commented 1 year ago

Please note, this can either not be the same as #14605 or the backport #14752 did not address this correctly . This fails with a different stack trace and still occurs in 7.17.7

The referenced 7.x backport is tagged 7.17.8, indicating that it is not expected to be available in the 7.17.7 reported here. Additionally, the stack trace posted on this issue aligns with the stack trace on #14599, which is where the issue was first reported. I would advise upgrading to 7.17.8 to consume the fix.

An in-place remediation for <= 7.17.7 would be to back up the DLQ directory, delete any .log files in the DLQ's data directory that are exactly one byte in size, and then restart the associated pipeline.
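
For illustration only, here is that remediation as a minimal Java sketch (a shell find with -size 1c -delete would achieve the same); the directory path and class name are assumptions, not part of Logstash:

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: remove DLQ segment files that contain only the
// 1-byte version header. Back up the DLQ directory before running this.
public class PruneEmptyDlqSegments {
    public static void main(String[] args) throws IOException {
        Path dlqDir = Paths.get("/usr/share/logstash/data/dead_letter_queue/test_pipeline");
        try (DirectoryStream<Path> segments = Files.newDirectoryStream(dlqDir, "*.log")) {
            for (Path segment : segments) {
                if (Files.size(segment) == 1) { // only the version byte, no events
                    System.out.println("deleting " + segment);
                    Files.delete(segment);
                }
            }
        }
    }
}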

ash-darin commented 1 year ago

Hello,

I'm sorry, I seem to have posted the wrong stack trace. I tested this on 7.17.8 and it fails there too. I have now reproduced this with an old copy of the dead_letter_queue directory instead of the files from #14605 and posted the stack trace below. 7.17.7 is the state of the test environment that I use as a staging area for my upgrade; that's why I still referred to that version, as this problem has plagued me since late October. Thank you, by the way, for clarifying that a fix is not widely available until the version tag is added to the issue.

Stack Trace from 7.17.8 below

Starting Logstash {"logstash.version"=>"7.17.8", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.17+8 on 11.0.17+8 +indy +jit [linux-x86_64]"}

[2023-02-28T09:08:36,578][ERROR][logstash.javapipeline    ][dead_letter_queue_valve][ebd78db2aa8c043ce909e8eeafc66cd3f13aae8cb3f9deba86b361e4b0013eba] A plugin had an unrecoverable error. Will restart this plugin.
  Pipeline_id:dead_letter_queue_valve
  Plugin: <LogStash::Inputs::DeadLetterQueue pipeline_id=>"oam_filebeat_7x_in", path=>"/usr/share/logstash/data/dead_letter_queue/", add_field=>{"[sl][dlq][pipeline]"=>"oam_filebeat_7x_in"}, id=>"ebd78db2aa8c043ce909e8eeafc66cd3f13aae8cb3f9deba86b361e4b0013eba", commit_offsets=>true, enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_eb311f2b-fa0e-4ca3-ad16-79d35f4c503c", enable_metric=>true, charset=>"UTF-8">>
  Error: newPosition < 0: (-1 < 0)
  Exception: Java::JavaLang::IllegalArgumentException
  Stack: java.nio.Buffer.createPositionException(java/nio/Buffer.java:318)
java.nio.Buffer.position(java/nio/Buffer.java:293)
java.nio.ByteBuffer.position(java/nio/ByteBuffer.java:1094)
org.logstash.common.io.RecordIOReader.consumeBlock(org/logstash/common/io/RecordIOReader.java:184)
org.logstash.common.io.RecordIOReader.consumeToStartOfEvent(org/logstash/common/io/RecordIOReader.java:238)
org.logstash.common.io.RecordIOReader.readEvent(org/logstash/common/io/RecordIOReader.java:282)
org.logstash.common.io.DeadLetterQueueReader.pollEntryBytes(org/logstash/common/io/DeadLetterQueueReader.java:144)
org.logstash.common.io.DeadLetterQueueReader.pollEntry(org/logstash/common/io/DeadLetterQueueReader.java:121)
org.logstash.input.DeadLetterQueueInputPlugin.run(org/logstash/input/DeadLetterQueueInputPlugin.java:104)
jdk.internal.reflect.GeneratedMethodAccessor179.invoke(jdk/internal/reflect/GeneratedMethodAccessor179)
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(jdk/internal/reflect/DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:566)
org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:456)
org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:317)
usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_input_minus_dead_letter_queue_minus_1_dot_1_dot_12.lib.logstash.inputs.dead_letter_queue.run(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-dead_letter_queue-1.1.12/lib/logstash/inputs/dead_letter_queue.rb:74)
usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.inputworker(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:410)
usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start_input(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:401)
org.jruby.RubyProc.call(org/jruby/RubyProc.java:318)
java.lang.Thread.run(java/lang/Thread.java:829)

This stack trace runs through this:

org.logstash.common.io.DeadLetterQueueReader.pollEntryBytes(org/logstash/common/io/DeadLetterQueueReader.java:144)
org.logstash.common.io.DeadLetterQueueReader.pollEntry(org/logstash/common/io/DeadLetterQueueReader.java:121)

which the other stack trace does not. I cannot tell how significant this difference is, though.

P.S.: Should I edit the original issue to correct my error and reflect this?

ash-darin commented 1 year ago

An in-place remediation for <= 7.17.7 would be to back up the DLQ directory, delete any .log files in the DLQ's data directory that are exactly one byte in size, and then restart the associated pipeline.

I was told about this workaround by another person as well, but as it does not seem to prevent the issue from recurring, I am reluctant to apply it. Is there any information on whether this workaround prevents the problem state from being reached?

ash-darin commented 1 year ago

A difference from #14605 is that the sincedb file from that ticket points to

/var/lib/logstash/dead_letter_queue/live-cf-filebeat/170033.log

root@test01:~/unpack# hexdump -C sincedb_1dcdb29e71c4b6a0ec8e39a1d1757aa3
00000000  00 31 00 00 00 3f 2f 76  61 72 2f 6c 69 62 2f 6c  |.1...?/var/lib/l|
00000010  6f 67 73 74 61 73 68 2f  64 65 61 64 5f 6c 65 74  |ogstash/dead_let|
00000020  74 65 72 5f 71 75 65 75  65 2f 6c 69 76 65 2d 63  |ter_queue/live-c|
00000030  66 2d 66 69 6c 65 62 65  61 74 2f 31 37 30 30 33  |f-filebeat/17003|
00000040  33 2e 6c 6f 67 00 00 00  00 00 00 13 6d 00 00 00  |3.log.......m...|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000080

That path is not the suggested /LOGSTASH_HOME/data/dead_letter_queue/test_pipeline and will most likely not exist on any system where this was reproduced, but the error was triggered anyway. The sincedb files on my system, by contrast, point to files that actually exist in my filesystem.

root@test01:/usr/share/logstash/data/plugins/inputs/dead_letter_queue/oam_filebeat_7x_in# hexdump -C .sincedb_78cbb9a3be93df496ba2a807934c836d
00000000  00 31 00 00 00 43 2f 75  73 72 2f 73 68 61 72 65  |.1...C/usr/share|
00000010  2f 6c 6f 67 73 74 61 73  68 2f 64 61 74 61 2f 64  |/logstash/data/d|
00000020  65 61 64 5f 6c 65 74 74  65 72 5f 71 75 65 75 65  |ead_letter_queue|
00000030  2f 6f 61 6d 5f 66 69 6c  65 62 65 61 74 5f 37 78  |/oam_filebeat_7x|
00000040  5f 69 6e 2f 37 2e 6c 6f  67 00 00 00 00 00 00 00  |_in/7.log.......|
00000050  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000084

As I cannot simply edit a sincedb file, I cannot provide you with a file to reproduce this.
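
For illustration, though, here is a hedged Java sketch of how such a file could be synthesized, assuming the layout inferred from the two hexdumps above (a 2-byte version character '1', a 4-byte big-endian path length, the path bytes, and an 8-byte position; the trailing zero padding visible in the dumps is ignored). This is an inference from the dumps, not a documented format:

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sincedb writer, inferred from the hexdumps above.
public class WriteSinceDb {
    public static void main(String[] args) throws IOException {
        String segmentPath = "/usr/share/logstash/data/dead_letter_queue/oam_filebeat_7x_in/7.log";
        long position = 1L; // offset within the segment, as in the second dump
        byte[] pathBytes = segmentPath.getBytes(StandardCharsets.UTF_8);
        try (DataOutputStream out = new DataOutputStream(
                new FileOutputStream(".sincedb_synthesized"))) {
            out.writeChar('1');             // 0x00 0x31, matching both dumps
            out.writeInt(pathBytes.length); // 0x43 (67) for this path
            out.write(pathBytes);
            out.writeLong(position);        // 0x...01 in the second dump
        }
    }
}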

It would be helpful to know under which conditions these empty 1.log / 2.log etc. files are actually created. Can somebody elaborate?

andsel commented 1 year ago

It would be helpful to know under which conditions these empty 1.log / 2.log etc. files are actually created

Empty files shouldn't be created. The DLQ creates an <index>.log.tmp file, which is the actual head segment of the DLQ, and only the upstream pipeline can write into it. When it's created, it contains just the version (1 byte). It grows as data comes in. When it's finalized, it is renamed to remove the .tmp extension and is then considered valid to be consumed by the downstream pipeline.
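
A minimal sketch of that lifecycle, under stated assumptions (the class and constant names are hypothetical; the real logic lives in org.logstash.common.io.RecordIOWriter, linked later in this thread):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Sketch of the head-segment lifecycle described above (hypothetical names).
public class SegmentLifecycle {
    static final byte VERSION = '1'; // the 1-byte header every segment starts with

    public static void main(String[] args) throws IOException {
        Path dlqDir = Files.createTempDirectory("dlq-demo");
        Path head = dlqDir.resolve("1.log.tmp");

        // 1. Creation: the head segment starts out containing only the version byte.
        Files.write(head, new byte[] { VERSION });

        // 2. Growth: the upstream pipeline appends dead-lettered events.
        Files.write(head, "serialized-event-bytes".getBytes(), StandardOpenOption.APPEND);

        // 3. Finalization: the rename drops the .tmp suffix; only now may the
        //    downstream (reader) pipeline consume the segment.
        Path sealed = dlqDir.resolve("1.log");
        Files.move(head, sealed, StandardCopyOption.ATOMIC_MOVE);
        System.out.println("finalized: " + sealed + " (" + Files.size(sealed) + " bytes)");
    }
}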

If a head segment has received some writes, so its size is greater than 1 byte, it could be finalized if:

Maybe the 1-byte segments originated before PR #12304, fixed in 7.10.0.

I tried to reproduce your issue after creating a couple of segments with an upstream pipeline, using something simple that writes to a closed index:

input {
  stdin {
    codec => json
  }
}

output {
  elasticsearch {
    index => "test_index"
    hosts => "http://localhost:9200"
    user => "elastic"
    password => "secret"
  }
}

How to close the index?

POST test_index/_close

I used the sincedb_1dcdb29e71c4b6a0ec8e39a1d1757aa3 present in dlq_sincedb_log.zip, and a pipeline like:

input {
    dead_letter_queue {
        sincedb_path => "/path/to/sincedb_1dcdb29e71c4b6a0ec8e39a1d1757aa3"
        path => "/path/to/logstash/data/dead_letter_queue/"
        pipeline_id => "main"
    }
}
output {
    stdout {}
}

In this case, the path contained in the sincedb file is:

/var/lib/logstash/dead_letter_queue/live-cf-filebeat/170033.log

which doesn't exist in the local filesystem. So when the DLQ reader's setCurrentReaderAndPosition("/var/lib/logstash/dead_letter_queue/live-cf-filebeat/170033.log") is given a file that doesn't exist, it falls back to the first segment: https://github.com/elastic/logstash/blob/v7.17.8/logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueReader.java#L166-L169.

The ordering is done by segment id. In this case I think you have 7 segments in your /usr/share/logstash/data/dead_letter_queue/oam_filebeat_7x_in/, so it checks which segments have an index greater than 170033; there are none, so it starts polling data from the first segment present in the DLQ folder.
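
A sketch of that selection logic, under those assumptions (hypothetical names; the DeadLetterQueueReader lines linked above are the authoritative version):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical shape of the fallback described above: if the segment named in
// the sincedb no longer exists, prefer the first segment with a greater id;
// if none, fall back to the oldest segment in the DLQ folder.
public class SeekFallback {
    static long segmentId(Path p) {
        String name = p.getFileName().toString(); // e.g. "170033.log"
        return Long.parseLong(name.substring(0, name.indexOf('.')));
    }

    static Path resolveStartSegment(Path dlqDir, long referencedId) throws IOException {
        try (Stream<Path> stream = Files.list(dlqDir)) {
            List<Path> segments = stream
                    .filter(p -> p.getFileName().toString().matches("\\d+\\.log"))
                    .sorted(Comparator.comparingLong(SeekFallback::segmentId))
                    .collect(Collectors.toList());
            return segments.stream()
                    .filter(p -> segmentId(p) > referencedId) // e.g. > 170033: none here
                    .findFirst()
                    .orElse(segments.isEmpty() ? null : segments.get(0));
        }
    }
}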

Please could you provide a listing of your DLQ segments?

ls -lart /usr/share/logstash/data/dead_letter_queue/oam_filebeat_7x_in

Also, some DLQ debug logs from when this happens would be helpful to understand what's going on.

andsel commented 1 year ago

The difference between the two stack traces is that the one provided in #14599

java.nio.Buffer.position(java/nio/Buffer.java:293)
java.nio.ByteBuffer.position(java/nio/ByteBuffer.java:1094)
java.nio.ByteBuffer.position(java/nio/ByteBuffer.java:262)
org.logstash.common.io.RecordIOReader.consumeBlock(org/logstash/common/io/RecordIOReader.java:184)
org.logstash.common.io.RecordIOReader.consumeToStartOfEvent(org/logstash/common/io/RecordIOReader.java:238)
org.logstash.common.io.RecordIOReader.readEvent(org/logstash/common/io/RecordIOReader.java:282)
org.logstash.common.io.RecordIOReader.seekToNextEventPosition(org/logstash/common/io/RecordIOReader.java:156)
org.logstash.common.io.DeadLetterQueueReader.seekToNextEvent(org/logstash/common/io/DeadLetterQueueReader.java:87)
org.logstash.input.DeadLetterQueueInputPlugin.setInitialReaderState(org/logstash/input/DeadLetterQueueInputPlugin.java:95)
org.logstash.input.DeadLetterQueueInputPlugin.lazyInitQueueReader(org/logstash/input/DeadLetterQueueInputPlugin.java:74)
org.logstash.input.DeadLetterQueueInputPlugin.register(org/logstash/input/DeadLetterQueueInputPlugin.java:81)

started in the DLQ input's register phase, in particular while loading events from a timestamp; this happens when the start_timestamp setting is used. This one, in contrast,

java.nio.Buffer.createPositionException(java/nio/Buffer.java:318)
java.nio.Buffer.position(java/nio/Buffer.java:293)
java.nio.ByteBuffer.position(java/nio/ByteBuffer.java:1094)
org.logstash.common.io.RecordIOReader.consumeBlock(org/logstash/common/io/RecordIOReader.java:184)
org.logstash.common.io.RecordIOReader.consumeToStartOfEvent(org/logstash/common/io/RecordIOReader.java:238)
org.logstash.common.io.RecordIOReader.readEvent(org/logstash/common/io/RecordIOReader.java:282)
org.logstash.common.io.DeadLetterQueueReader.pollEntryBytes(org/logstash/common/io/DeadLetterQueueReader.java:144)
org.logstash.common.io.DeadLetterQueueReader.pollEntry(org/logstash/common/io/DeadLetterQueueReader.java:121)
org.logstash.input.DeadLetterQueueInputPlugin.run(org/logstash/input/DeadLetterQueueInputPlugin.java:104)

happens when a reference from the sincedb file is used.

https://github.com/logstash-plugins/logstash-input-dead_letter_queue/blob/252cbcccab88ba259d22cac6ea31b9b9e4856507/src/main/java/org/logstash/input/DeadLetterQueueInputPlugin.java#L119-L130

ash-darin commented 1 year ago

Please could you provide a listing of your DLQ segments?

ls -lart /usr/share/logstash/data/dead_letter_queue/oam_filebeat_7x_in

Here is a (shortened) listing of the directory:

logstash@prod-system:~$ ls -lart /usr/share/logstash/data/dead_letter_queue/oam_filebeat_7x_in
total 6360
-rw-r--r--  1 logstash logstash     1 Feb 14  2022 1.log
-rw-r--r--  1 logstash logstash     1 Feb 17  2022 2.log
-rw-r--r--  1 logstash logstash     1 Feb 21  2022 3.log
-rw-r--r--  1 logstash logstash     1 Feb 21  2022 4.log
-rw-r--r--  1 logstash logstash     1 Feb 21  2022 5.log
-rw-r--r--  1 logstash logstash     1 Feb 21  2022 6.log
-rw-r--r--  1 logstash logstash     1 Feb 21  2022 7.log
-rw-r--r--  1 logstash logstash     1 Feb 22  2022 8.log
-rw-r--r--  1 logstash logstash     1 Feb 22  2022 9.log
[ 1567 more lines of this ]
-rw-r--r--  1 logstash logstash     1 Sep 28  2022 1576.log
-rw-r--r--  1 logstash logstash     1 Sep 28  2022 1577.log
-rw-r--r--  1 logstash logstash     1 Sep 28  2022 1578.log
drwxr-xr-x 20 logstash logstash  4096 Dec  9 11:46 ..
-rw-r--r--  1 logstash logstash     0 May 22 17:47 .lock
-rw-r--r--  1 logstash logstash     1 May 22 17:47 1579.log.tmp
drwxr-xr-x  2 logstash logstash 36864 May 22 17:47 .

I hope this helps?

andsel commented 1 year ago

I've tried to reproduce this locally with a fresh installation of 7.17.8, creating empty segment files, and the reader doesn't crash.

Test setup

ash-darin commented 1 year ago

The files that the script creates bear no resemblance to the files I have. My files:

hexdump -C 2.log
00000000  31                                                |1|
00000001

Your files:

hexdump -C 2.log 
00000000  01                                                |.|
00000001

I therefore didn't check whether these files pose any problems in my setup.

I retested with 7.17.10 and got this stack trace (names slightly edited as it is from a customer):

[2023-06-15T12:15:26,025][ERROR][logstash.javapipeline    ][dead_letter_queue_valve][0b80bf3de007ce3172abef211e15b38452ffe788303f7411b801cf4b2d02cfab] A plugin had an unrecoverable error. Will restart this plugin.
  Pipeline_id:dead_letter_queue_valve
  Plugin: <LogStash::Inputs::DeadLetterQueue pipeline_id=>"oam_test_filebeat_in", path=>"/usr/share/logstash/data/dead_letter_queue/", add_field=>{"[dlq][pipeline]"=>"oam_test_filebeat_in"}, id=>"0b80bf3de007ce3172abef211e15b38452ffe788303f7411b801cf4b2d02abfc", commit_offsets=>true, enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_0f52d278-6529-4319-b6d5-b78ab5a33f21", enable_metric=>true, charset=>"UTF-8">>
  Error:
  Exception: Java::JavaNio::InvalidMarkException
  Stack: java.nio.Buffer.reset(java/nio/Buffer.java:399)
java.nio.ByteBuffer.reset(java/nio/ByteBuffer.java:1133)
org.logstash.common.io.RecordIOReader.consumeBlock(org/logstash/common/io/RecordIOReader.java:188)
org.logstash.common.io.RecordIOReader.consumeToStartOfEvent(org/logstash/common/io/RecordIOReader.java:238)
org.logstash.common.io.RecordIOReader.readEvent(org/logstash/common/io/RecordIOReader.java:282)
org.logstash.common.io.DeadLetterQueueReader.pollEntryBytes(org/logstash/common/io/DeadLetterQueueReader.java:144)
org.logstash.common.io.DeadLetterQueueReader.pollEntry(org/logstash/common/io/DeadLetterQueueReader.java:121)
org.logstash.input.DeadLetterQueueInputPlugin.run(org/logstash/input/DeadLetterQueueInputPlugin.java:104)
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(jdk/internal/reflect/NativeMethodAccessorImpl.java:62)
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(jdk/internal/reflect/DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:566)
org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:426)
org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:293)
usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_input_minus_dead_letter_queue_minus_1_dot_1_dot_12.lib.logstash.inputs.dead_letter_queue.run(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-dead_letter_queue-1.1.12/lib/logstash/inputs/dead_letter_queue.rb:74)
usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.inputworker(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:410)
usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start_input(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:401)
org.jruby.RubyProc.call(org/jruby/RubyProc.java:318)
java.lang.Thread.run(java/lang/Thread.java:829)

andsel commented 1 year ago

The files that script creates bear no resemblance to the files I have

My bad, you are right: the DLQ saves the character '1', not the integer value 1:

https://github.com/elastic/logstash/blob/c187158b58c927599a7df6b410f95b4a9cc601f2/logstash-core/src/main/java/org/logstash/common/io/RecordIOWriter.java#L110

I retested with 7.17.10 and got this stack trace ...

This is a different stack trace; it contains an InvalidMarkException instead of the original IllegalArgumentException. I'll give it a look.

andsel commented 1 year ago

I've updated the gist to create the segment files with the '1' character instead of the number.
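
For reference, the observable difference in a short Java snippet (the file names are hypothetical; RecordIOWriter's actual write is at the line linked above):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// The one-byte difference discussed above: DLQ segments start with the
// ASCII character '1' (0x31), not the numeric value 1 (0x01).
public class VersionByte {
    public static void main(String[] args) throws IOException {
        Files.write(Paths.get("char-version.log"), new byte[] { (byte) '1' }); // hexdump: 31 |1|
        Files.write(Paths.get("int-version.log"),  new byte[] { (byte) 1 });   // hexdump: 01 |.|
    }
}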

ash-darin commented 1 year ago

This is a different stack trace; it contains an InvalidMarkException instead of the original IllegalArgumentException. I'll give it a look.

Thank you for looking into it; unfortunately my knowledge of Java is too limited to debug this myself. It is really appreciated.