Closed jerome83136 closed 2 years ago
Hi,
I suspect that node-logstash starts reading files from the beginning but stops at the line that was the last one in the file when node-logstash started.
Example: My log file is 10 lines long. I start node-logstash, which reads from line 1 to line 10 and stops, even if a new line 11 gets created after node-logstash startup.
--> verified
This is what happens. I have started node-logstash (.dbfile deleted) and the last lines of logs were time-stamped 2016/06/23 09h50
The output files written by node-logstash (file output plugin) are filled up with the content of the input files from their beginning and the last lines are time-stamped 2016/06/23 10h50
I have another instance of node-logstash, which reads the same input files and sends them to Elasticsearch, and in my index I also have no logs after 2016/06/23 10h50.
Thanks for your help
Jérôme
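One way to make this check repeatable is to compare the newest timestamp in an input file with the newest one in the corresponding output file. The following is only a minimal sketch: `last_timestamp` is a hypothetical helper, and it assumes that both files contain Apache-style bracketed timestamps (the `dd/MMM/yyyy:HH:mm:ss ZZ` format from the filter config) and that the file output keeps the original log line.

```python
import re
from datetime import datetime
from typing import Optional

# Assumption: timestamps look like [23/Jun/2016:10:50:00 +0200]
TIMESTAMP_RE = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"
)


def last_timestamp(path: str) -> Optional[datetime]:
    """Return the timestamp of the last line in `path` that has one, else None."""
    latest = None
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = TIMESTAMP_RE.search(line)
            if match:
                # Keep overwriting so the value from the final matching line wins.
                latest = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
    return latest
```

If `last_timestamp` on the output file stays fixed while the value for the input file keeps advancing, the reader has stopped following the input.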
I'm not able to reproduce any problem :(
My test config :
input {
  file {
    #use_tail => false
    start_index => 0
    path => 'toto.log'
  }
}
output {
  stdout {
    codec => json
  }
  file {
    path => out.log
  }
}
With or without start_index, with or without db_file, everything seems to be OK. I ran the tests on Linux Debian 7, node 4.1.1.
I also did some tests with Apache2 log rotation (to be exact, with the logrotate config deployed by the Debian package): everything seems to work as expected.
Can you provide more details?
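To test the "line 11" case against the config above, something has to keep appending to toto.log after the agent has started. A minimal sketch of such a writer follows; `append_lines` and the line format are made up for illustration and are not part of node-logstash.

```python
import time
from datetime import datetime


def append_lines(path: str, count: int, delay_seconds: float = 1.0) -> None:
    """Append `count` numbered, timestamped lines to `path`, pausing between writes."""
    for number in range(1, count + 1):
        # Reopen in append mode for each write, so every line is flushed
        # to disk before the next pause.
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(f"{datetime.now().isoformat()} test line {number}\n")
        if number < count:
            time.sleep(delay_seconds)
```

Calling `append_lines("toto.log", count=60)` while the agent is running should make every new line show up on stdout and in out.log; if the output stops at the lines that existed at startup, the problem is reproduced.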
Hello,
Thank you for your investigations.
I'm using CentOS 7.0 x86_64 and node-logstash version 0.0.5.
I use this command line:
/products/node-logstash/bin/node-logstash-agent --log_level=error --config_file=/conf/logstash/logstash.app-es.node.conf --log_file=/logs/logstash/logs.node-logstash.app-es.log --db_file=/var/tmp/logs.node-logstash.app-es.dbfile
input {
  file {
    #use_tail => true
    start_index => 0
    path => '/central_logs/input/prod/webservers/webserver1/apache/app/access_FH?_log'
    add_field => { "application" => "app" }
  }
  file {
    #use_tail => true
    start_index => 0
    path => '/central_logs/input/prod/webservers/webserver1/apache/app2/access_*MALE_log'
    add_field => { "application" => "app2" }
  }
  file {
    #use_tail => true
    start_index => 0
    path => '/central_logs/input/prod/webservers/webserver2/apache/app/access_FH?_log'
    add_field => { "application" => "app" }
  }
  file {
    #use_tail => true
    start_index => 0
    path => '/central_logs/input/prod/webservers/webserver2/apache/app2/access_*MALE_log'
    add_field => { "application" => "app2" }
  }
}
filter {
  regex {
    regex => /([0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3})\s(.*)\s(.*)\s\[.*\]\s\"([A-Z]*)\s\/(webshop\/.*)\sHTTP\/[0-9].[0-9]\"\s{1,2}([0-9]{3})\s([0-9]*|-)\s\|(.*)\|\s\{(.*)\}\s([0-9]*|-)\s(zp2web0[0-9]|-)\s\+(lt[m,f][a,b,c][0-9]*xz[0-9]*wty|-)\_\_[-,1,0]\+\s\(([A-Z0-9]*\.lt[m,f][a,b,c][0-9]*xz[0-9]*wty|-)\)\s\<(.*)\>/
    fields => [clientip, user_http, user_app, timestamp, method, request, http_code, http_lenght, referer, user_agent, http_time, webserver, jvm, jsessionid, ssl_version]
    numerical_fields => [http_code, http_lenght, http_time]
    date_format => ['dd/MMM/yyyy:HH:mm:ss ZZ']
  }
  #Getting GeoIP information for the event
  geoip {
    field => clientip
    cache_size => 1000
  }
}
output {
  elasticsearch {
    host => ladmlogs1
    port => 9101
    index_prefix => applicationstst
    bulk_limit => 100
    bulk_timeout => 100
  }
}
What happens: My logs are read from the beginning of the input files, but node-logstash stops reading them after some time. It seems to stop at the line that was the last one in the input file when node-logstash was started.
What I expect: I would like node-logstash to start reading from the beginning (because I delete the .dbfile) and to keep reading the file until the log rotation, then continue with the file created after the truncate.
Thanks for your help
Jérôme
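The expected behaviour described above can be written down as a small polling reader. This is only a sketch of the semantics I am asking for, not how node-logstash implements its file input; `follow`, `poll_seconds` and `max_idle_polls` are names invented for this example. It reads from offset 0, keeps polling for appended data, and restarts from 0 when the file shrinks (truncate on rotation). A truncate followed by a rewrite that grows past the old offset before the next poll would go unnoticed by this simple size check.

```python
import os
import time
from typing import Iterator, Optional


def follow(path: str, poll_seconds: float = 1.0,
           max_idle_polls: Optional[int] = None) -> Iterator[str]:
    """Yield complete lines from `path`, starting at the beginning and
    continuing as the file grows. Stops after `max_idle_polls` consecutive
    polls without new data (None means never stop)."""
    offset = 0
    pending = b""  # bytes of a line whose trailing newline has not arrived yet
    idle_polls = 0
    while max_idle_polls is None or idle_polls < max_idle_polls:
        try:
            size = os.path.getsize(path)
        except FileNotFoundError:
            size = 0  # file is briefly missing during rotation
        if size < offset:
            # The file shrank: treat it as truncated and start over.
            offset = 0
            pending = b""
        if size > offset:
            with open(path, "rb") as handle:
                handle.seek(offset)
                chunk = handle.read()
            offset += len(chunk)
            # Everything before the last newline is complete; keep the rest.
            *lines, pending = (pending + chunk).split(b"\n")
            for line in lines:
                yield line.decode("utf-8", errors="replace")
            idle_polls = 0
        else:
            idle_polls += 1
            time.sleep(poll_seconds)
```

With `for line in follow("access_log"): print(line)`, a line appended an hour after startup is still printed, which is the part that does not seem to happen for me without tail mode.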
Hello,
I am using the file input plugin without the "tail" mode, as you recommended.
After some time, it seems my input files are not read anymore (I'm not sure, but it seems to happen after about 1 hour). I don't understand it, because these files are never stale: new lines are always being written to them (Apache access logs), but node-logstash seems to "detach" from them after some time.
I have tried the tail mode, and with it my files are read fine even after they get truncated by the Apache log rotation. But I would prefer to use the input plugin without tail, because tail does not start reading my files from the beginning (even if I use start_index => 0).
I suspect that node-logstash starts reading files from the beginning but stops at the line that was the last one in the file when node-logstash started.
Example: My log file is 10 lines long. I start node-logstash, which reads from line 1 to line 10 and stops, even if a new line 11 gets created after node-logstash startup.
What do you think about this? Any idea how I can get it to work?
Here is a part of my config file:
Please note that I start node-logstash with this parameter:
--db_file=/var/tmp/logs.node-logstash.myapps.dbfile
Before restarting node-logstash I delete this file to ensure the input files are read from the beginning.
Thank you for your help
Jérôme