Open ccurzon opened 8 years ago
I hit the same situation. I specified the wrong sincedb path in the configuration of the s3 plugin, and the Logstash pipeline went into an infinite loop. It looks to me like a simple fix: validate the sincedb path before attempting to load the files from S3.
BEHAVIOR
So I'm using Logstash to load a CSV file containing 4998079 records. While it is loading I'm tracking the count in Elasticsearch, and I notice that the count goes well beyond 5M and keeps climbing.
Then I notice that the console shows the following message:
A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::File path=>["/home/zed/logstash-2.1.0/source_data/nga/AIS_Ships_PD-3874.csv"], sincedb_path=>"/home/zed/logstash-2.1.0/sincedb_path/AIS_Ships_PD-3874.since", type=>"core2", start_position=>"beginning", codec=><LogStash::Codecs::Plain charset=>"UTF-8">, stat_interval=>1, discover_interval=>15, sincedb_write_interval=>15, delimiter=>"\n"> Error: No such file or directory - /home/zed/logstash-2.1.0/sincedb_path/AIS_Ships_PD-3874.since.14300.17459.522421 {:level=>:error}
This means the load process has restarted from the beginning of the file, which is why the record count continues to climb.
So out of curiosity, I wait to see if it will halt at the end of the second load. But Logstash issues the same fatal error message and then restarts the data load, running it for the third time. So I killed the Logstash process.
The problem, which is clear from the error message, is that the specified sincedb_path does not exist. (Through a typo "sample_date" was used in the sincedb_path where "sample_data" would have been correct.)
The --configtest flag did not catch this.
TEST CASE
input {
  file {
    path => "/home/zed/logstash-2.1.0/source_data/gdelt/20151124.export.CSV"
    sincedb_path => "/homex/zed/logstash-2.1.0/load_conf/gdelt.since"  # erroneous /homex/ does not exist
    type => "gdelt"
    start_position => "beginning"
  }
}
The path "/homex" is intentionally wrong in order to check the --configtest flag. Note that with this configuration, --configtest reported that the configuration was OK, even though the path did not exist!
CONSEQUENCE
1) All 4998079 records in the CSV file are loaded, and then status processing happens.
2) Apparently, Logstash then tries to write a status to the sincedb file, but fails and emits a fatal error to the console, such as:
A plugin had an unrecoverable error. Will restart this plugin. Plugin: <LogStash::Inputs::File path=>["/home/zed/logstash-2.1.0/source_data/nga/AIS_Ships_PD-3874.csv"], sincedb_path=>"/home/zed/logstash-2.1.0/sincedb_path/AIS_Ships_PD-3874.since", type=>"core2", start_position=>"beginning", codec=><LogStash::Codecs::Plain charset=>"UTF-8">, stat_interval=>1, discover_interval=>15, sincedb_write_interval=>15, delimiter=>"\n"> Error: No such file or directory - /home/zed/logstash-2.1.0/sincedb_path/AIS_Ships_P
3) After failing to write to the sincedb file, Logstash reports a fatal error and restarts the data load.
4) The cycle repeats, starting at step 1, loading an additional 4998079 records each time. An infinite loop.
SUGGESTED RESOLUTION
1) Logstash should touch the sincedb file before starting the load, in order to create it and verify permissions.
2) The --configtest flag should verify the sincedb_path value as part of its test.
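
To illustrate the first suggestion, here is a minimal sketch in Ruby (the language Logstash plugins are written in). It is not Logstash code: `sincedb_path_problems` is a hypothetical helper name, and the checks are my assumption of what a pre-flight test could look like. The directory check is there because the error above names a file with a numeric suffix next to the sincedb file, which suggests (again, an assumption) that a temporary file is written in the same directory.

```ruby
# Hypothetical pre-flight check, NOT part of Logstash or the file input plugin.
# Returns an array of problem descriptions; an empty array means the
# sincedb path looks usable.
def sincedb_path_problems(sincedb_path)
  dir = File.dirname(File.expand_path(sincedb_path))

  # The parent directory must already exist; a typo in the path fails here.
  return ["directory does not exist: #{dir}"] unless File.directory?(dir)

  # Assumption: the writer may need to create a temporary file next to
  # the sincedb file, so the directory itself must be writable.
  return ["directory is not writable: #{dir}"] unless File.writable?(dir)

  begin
    # Opening in append mode creates the file if it is missing (like touch)
    # and fails if it cannot be written, without changing existing content.
    File.open(sincedb_path, "a") {}
  rescue SystemCallError => e
    return ["cannot create or open #{sincedb_path}: #{e.message}"]
  end

  []
end
```

Called with the sincedb_path from the test case before any records are read, a check like this would report that the /homex/... directory does not exist, and the load could be refused up front instead of failing after the whole file has been processed.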
-- Chris Curzon