elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash

Logstash skipping lines when moving between files #1902

Closed kryten68 closed 9 years ago

kryten68 commented 10 years ago

In Windows the two log files below are to be processed by the conf also below.

The first file, app1.log is read in and processed just fine.

The second file is then processed - BUT the first three lines are skipped entirely. The second file typically starts at 'Identified 1 listener(s).' It does this every single time.

This appears to happen only on Windows, and it happens on both LS 1.4.1 and LS 1.4.2.

Have tried all sorts of combinations and have now virtually given up. Would deeply appreciate any insights into what may be going on as this problem is seriously damaging to our uptake of the ELK stack.

THANKS

The text below is app1.log:

    2014-09-09 00:00:00.000+0100 Starting...
    2014-09-09 00:00:01.000+0100 Started
    2014-09-09 00:00:02.000+0100 Listening...
    2014-09-09 01:00:00.000+0100 Command received [shutdown]
    2014-09-09 01:00:01.000+0100 Shutting down...

The text below is app2.log:

    2014-09-09 00:00:00.000+0100 Starting initialization
    2014-09-09 00:00:01.000+0100 Initialization completed successfully
    2014-09-09 00:00:02.000+0100 Scanning for listeners
    2014-09-09 00:00:03.000+0100 Identified 1 listener(s)
    2014-09-09 00:00:04.000+0100 Registering 1 listeners(s)
    2014-09-09 00:00:05.000+0100 Registered listeners #1
    2014-09-09 00:00:06.000+0100 Registration complete successfully

Here is the conf (very simple):

input {

    file {
            path => [ "C:\ELK\app1.log", "C:\ELK\app2.log" ]
            start_position => "beginning"
            sincedb_path => "C:\ELK.sincedb"
    }

}

filter {

    grok {
            match => [ "message", "(?<datetime>\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d\.\d\d\d\+\d\d\d\d)\s+(?<text>.*)" ]
    }

    if "_grokparsefailure" not in [tags] {

            date {
                    match => [ "datetime", "YYYY-MM-dd HH:mm:ss.SSSZ" ]
            }
    }

}

output { stdout { codec => rubydebug } }

torrancew commented 10 years ago

From what several of us in #logstash can tell, this seems like a potential bug on Windows platforms. No one has been able to reproduce this behavior in OS X or Linux, but the OP has seen this issue consistently on numerous Windows boxes.

jordansissel commented 10 years ago

I'll take a look next week.

torrancew commented 10 years ago

@jordansissel much appreciated. I know that @kryten68 has a call scheduled with ES tomorrow morning, as he's on a deadline - not sure if that info is helpful to you at all.

jordansissel commented 10 years ago

I just resurrected my old windows laptop (I currently haven't invested in a dedicated windows testing environment) and might be able to reproduce this and provide an update before next week. I'm overdue on delivering the new build tooling for logstash, so I need to finish that first. :)

kryten68 commented 10 years ago

Thanks! Much appreciated.

I sent you a PM.. The appetite for ELK is very strong... Our command of the stack is pretty good, I think. Once we get past this bug, uptake should be very rapid.

Thank you, both, very much.

torrancew commented 10 years ago

@jordansissel of course! Just wanted to provide the additional context that was established in #logstash this AM. As always, thank you!

jordansissel commented 10 years ago

<3

kryten68 commented 10 years ago

Thanks Jordan. Any assistance you can provide with this issue would be very much appreciated. I'll look forward to hearing from you and continue to test and try to troubleshoot at this end.

kryten68 commented 10 years ago

gentlest-of-bumps-in-fact-barely-a-graze

wiibaa commented 10 years ago

@kryten68 as a workaround, did you try splitting this into one file input per file, each with a different sincedb? It has been a long time since I checked, but AFAIK the issues related to Windows and sincedb management are sadly still present...

kryten68 commented 10 years ago

Interesting... using a distinct file path and sincedb per event log does work. Just tested it and it appears to process line 1 from file 1 then line 1 from file 2 then line 2 from file 1 then line 2 from file 2 .. and so on.

This is good insight - but sadly we need to be able to monitor an entire folder of incoming logs (glob the path and use *.log) so not sure how we could work around the problem with this technique.
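
For reference, the per-file setup tested above might look like the following. This is a minimal sketch assuming the two files from the original report; the sincedb file names are hypothetical, and any distinct path per input should serve. Each file has to be listed in its own input, which is why this approach does not extend to a globbed folder.

```
input {
  # One file input per log file, each tracking its position in its own sincedb
  file {
    path => "C:\ELK\app1.log"
    start_position => "beginning"
    # Hypothetical sincedb name, distinct per input
    sincedb_path => "C:\ELK\app1.sincedb"
  }
  file {
    path => "C:\ELK\app2.log"
    start_position => "beginning"
    # Hypothetical sincedb name, distinct per input
    sincedb_path => "C:\ELK\app2.sincedb"
  }
}
```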

Sounds like this is not going to be fixed anytime soon....

kryten68 commented 10 years ago

Wiibaa suggested using a .sincedb path of "NUL" as a workaround. That helps immensely too.

With that setting the first three or more lines are still skipped initially but then they re-appear at the end of the output stream; so that's a reasonable workaround but at the cost of a functional .sincedb.
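
A minimal sketch of that setting, reusing the paths from the original report. NUL is the Windows null device, so nothing is persisted; the assumption here is that read positions are then lost on restart, and with start_position set to "beginning" the files would be read again from the top. Later comments in this thread report the behaviour as still unreliable, so this is an experiment rather than a fix.

```
input {
  file {
    path => [ "C:\ELK\app1.log", "C:\ELK\app2.log" ]
    start_position => "beginning"
    # NUL is the Windows null device: sincedb writes are discarded
    sincedb_path => "NUL"
  }
}
```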

Would love to see this fixed...

wiibaa commented 10 years ago

@kryten68 I preferred to remove my comment about pointing the sincedb to dev/null because, with some more testing, the behaviour is still unreliable and complex to debug/understand. I will try to dive into the ruby-filewatch supporting library, which should contain fixes for Windows that are not integrated in Logstash yet. Another workaround is to use the proposed nio2path contribution: drop the raw file (https://raw.githubusercontent.com/semiosis/logstash-contrib/LOGSTASH-1201/lib/logstash/inputs/nio2path.rb) into your lib/logstash/inputs folder and use this config:

input {
  nio2path {
    path => 'file:///D:/tmp/*.log'
    start_position => "beginning"
  }
}

Please see https://github.com/elasticsearch/logstash-contrib/pull/35 for details

kryten68 commented 10 years ago

Thank you very much - really appreciate this. I got the nio2path.rb and saved it into the input folder. I'm getting this error when trying to use it though:

    C:\ELK>C:\ELK\logstash-1.4.2\bin\logstash.bat agent -f demo.conf
    Using milestone 1 input plugin 'nio2path'. This plugin should work, but would benefit from use by folks like you. Please let us know if you find bugs or have suggestions on how to
    +---------------------------------------------------------+
    An unexpected error occurred. This is probably a bug.
    You can find help with this problem in a few places:

    * chat: #logstash IRC channel on freenode irc.
      IRC via the web: http://goo.gl/TI4Ro
    * email: logstash-users@googlegroups.com
    * bug system: https://logstash.jira.com/

    +---------------------------------------------------------+
    The error reported is:
      missing class or uppercase package name (`java.nio.file.Paths')

    C:\ELK>


wiibaa commented 10 years ago

@kryten68 forgot to mention that you need Java 7

kryten68 commented 10 years ago

@wiibaa Thanks. I'm running JSE7 Update 51 though...

wiibaa commented 10 years ago

@kryten68 seems strange. Are you 100% sure that JAVA_HOME points to it? What does echo %JAVA_HOME% show?

kryten68 commented 10 years ago

Thanks. Getting a bit further with it now:

    C:\ELK>logstash-1.4.2\bin\logstash.bat agent -f demo.conf
    Using milestone 1 input plugin 'nio2path'. This plugin should work, but would benefit from use by folks like you. Please let us know if you find bugs or have suggestions on how to improve this plugin.
    Unable to open path {:dirname=>"C:/elk/problem/", :error=>java.nio.file.FileSystemNotFoundException: Provider "C" not installed, :level=>:warn}
    +---------------------------------------------------------+
    An unexpected error occurred. This is probably a bug.
    You can find help with this problem in a few places:

    * chat: #logstash IRC channel on freenode irc.
      IRC via the web: http://goo.gl/TI4Ro
    * email: logstash-users@googlegroups.com
    * bug system: https://logstash.jira.com/

    +---------------------------------------------------------+
    The error reported is:
      Provider "C" not installed

I can confirm:

    C:\ELK>echo %JAVA_HOME%
    C:\Program Files\Java\jre7

    C:\ELK>


wiibaa commented 10 years ago

@kryten68 you should use file URI syntax in your config like path => 'file:///C:/ELK/*.log'

kryten68 commented 10 years ago

Bingo! Works perfectly and all messages are emitted in sequence. Superb. THANK YOU

Will this be the 'forward fix'? Or will "file" be repaired?

wiibaa commented 10 years ago

I cannot commit to what Logstash team will eventually do but I'm sure it's only a matter of time/priority management for @jordansissel to be able to fix the filewatch lib ;)

electrical commented 10 years ago

The thing is that the nio2path only works under Java. When people want to run Logstash under Ruby it won't work.

At the moment we are working on moving all plugins into separate repos (see https://github.com/logstash-plugins), so I think we can ship nio2path as a separate plugin for people to use. But I'm sure we will be fixing the current file input :-)

wiibaa commented 10 years ago

@electrical the madness is that filewatch master currently also contains java/jruby-specific code for Windows handling. If we take the time to validate it, can a new gem version be made and used in the next version of Logstash, like 1.4.3 and others? Or should a ruby-windows guru be sought and hired?

electrical commented 10 years ago

@wiibaa we defo should check whether the current master of filewatch works as expected under Windows (I hope it does), but I'm not sure when we will be able to do that. Or could someone try it out for us? :-)

wiibaa commented 10 years ago

@electrical this thread caught my attention and motivation, just need to find a little bit of time ;)

elvarb commented 10 years ago

The work around I have been using is to use nxlog to read the files and send to a tcp socket in Logstash.

kryten68 commented 9 years ago

Hi,

Could someone please clarify whether the 'file' input on Windows issue, as described in this thread has been addressed yet? We have been using, where appropriate the nio2path input instead, but the absence of the .sincedb has caused problems.

I'd be happy to help test any potential fixes or workarounds for the problem as originally described, if that would be helpful?

Would really appreciate a reply setting out the current position with it - thanks!

tuespetre commented 9 years ago

Just because I have not seen anyone explicitly mention it, this is what I've seen:

| log file | length | position where file input starts reading it |
|---|---|---|
| 150127.log | 59238 | 0 |
| 150202.log | 527171 | 59238 |
| 150203.log | 729280 | 527171 |
| 150204.log | 259862 | 0 |
| 150205.log | 609396 | 259862 |

So it starts reading, moves to the next file, and tries to start at the character position equal to the length of the previous file. If the file is not that long, it goes back to position 0 for that file (as we see with 150204.log).

kryten68 commented 9 years ago

Yes. Starting file y at the line number reached in file x is classic behaviour for this issue. Only on Windows though - never seen this on *nix or Mac. Thanks for confirming.

kryten68 commented 9 years ago

Is there any update on this issue at all? We have been trying to work around it with the nio2 path input but that doesn't accept path globbing just filename globbing.

Is there ANY kind of work around or fix on the horizon for this issue (missing entire files when globbing paths on Windows).

Thanks.

elvarb commented 9 years ago

I would just use nxlog in Windows to read file logs and parse them, then ship those to logstash as json
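
On the Logstash side, receiving from nxlog could be sketched with a tcp input like the one below. The port number is hypothetical and must match whatever the nxlog output is configured to send to, and the json_lines codec assumes nxlog emits one JSON object per line; the nxlog configuration itself is not shown.

```
input {
  tcp {
    # Hypothetical port; use the one the nxlog output sends to
    port => 5140
    # Assumes newline-delimited JSON from nxlog
    codec => json_lines
  }
}

output { stdout { codec => rubydebug } }
```

Because the file reading happens in nxlog, the Logstash file input and its sincedb are not involved in this setup.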

jordansissel commented 9 years ago

@kryten68 there are patches in the file watching library we use that should fix this. If you note the milestone on this ticket, we anticipate having this resolved for the 1.5.0 rc1 release.

tuespetre commented 9 years ago

@jordansissel any idea for a date on that release?

@kryten68 @elvarb I am having success with the modified version here: https://github.com/jordansissel/ruby-filewatch/issues/39#issuecomment-66992110 using the tip given here: https://github.com/jordansissel/ruby-filewatch/issues/39#issuecomment-67136389. There is a disclaimer about using it in production but I have had success with it for the time being.

suyograo commented 9 years ago

Fixed in v0.6.1 of filewatch gem

JeanFrancoisContour commented 9 years ago

I had the same problem (Windows, missing lines), so I installed logstash-1.5.0.beta1, but I did not see any improvement: lines are still missing with the Logstash file input.

In my test, I copy/paste files one by one into a directory that is monitored by a Logstash agent. Logstash still misses the first lines (10, 11, 16...).

* Only one sincedb file
* Windows 7 Enterprise

I may have missed something in the installation process. The only extra step I noticed was to get jruby-complete-1.7.19 and store it in C:\logstash-1.5.0.beta1\vendor\jar.

Thanks for the great work on ELK.

ph commented 9 years ago

@JeanFrancoisContour We just pushed a new release which includes the new version of filewatch, which should fix some of the Windows issues. Could you redo your test with this version: http://www.elasticsearch.org/blog/announcing-logstash-1-5-0-release-candidate/ ?

JeanFrancoisContour commented 9 years ago

Issue resolved with Logstash 1.5.0 RC1. Tests OK for me. Thanks.

OuesFa commented 9 years ago

@JeanFrancoisContour i have the same problem on windows with logstash 1.4.2. To resolve this issue, did you use the 'logstash core download' without adding any plugin or features ? Or did you use 'logstash 1.5.0 RC2 download' ? Thanks for your help.

JeanFrancoisContour commented 9 years ago

I used Logstash 1.5.0 RC1 but I guess Logstash 1.5.0 RC2 (http://download.elasticsearch.org/logstash/logstash/logstash-1.5.0.rc2.zip) is OK too

OuesFa commented 9 years ago

Ok thanks. I'm trying it, but a problem occurred. It starts to parse my logs and ship them into ES and works for a couple of seconds, then it stops, and I get an error message when I shut down the pipeline with ^C (_sincedb_write rename/sync failed: c:\'path of my sincedb'. new -> 'path of my since_db': Permission denied )

I don't know if it stops because of this error or if it's another issue. I found this topic (https://github.com/logstash-plugins/logstash-input-file/issues/16). It says that the error message has no effect on Logstash. @JeanFrancoisContour did u have this message with RC1 ?

JeanFrancoisContour commented 9 years ago

I had sometimes a "since_db permission denied" message when I shut down the pipeline too, but I don't really care. My advice is to investigate somewhere else (parse failure ?)

OuesFa commented 9 years ago

Thanks for quick reply.

I thought it was a connection problem between LS and ES, but even when I change my output conf to this one (output { if "_grokparsefailure" not in [tags] {stdout {}}}), which also adds a condition on parse failures, I still have the same problem: Logstash starts, then stops without reporting any error.

OuesFa commented 9 years ago

I had multiline filters in my LS conf that i removed. i'm testing right now and it seems to work. So i have to find a workaround to multiline filter i guess.