a3rd / molniya

Automatically exported from code.google.com/p/molniya
GNU General Public License v2.0

molniya terminates on parser error #22

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Leave molniya running on my debian sid system running icinga 1.2.1
2. Molniya terminates at random time with trace found below

What is the expected output? What do you see instead?
molniya should be robust and keep running instead of bombing out on a parse 
error. Yes, I could run it in a while loop, but it should be robust in the 
first place.
Why not back off for $time and then retry parsing the file, or something like that?
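
To make the back-off suggestion concrete, here is a minimal sketch of retrying with a growing delay. `with_retries` and its parameters are hypothetical names invented for this illustration; nothing like it is known to exist in molniya. It rescues `RuntimeError` only because that is the class shown in the trace below.

```ruby
# Hypothetical helper, not part of molniya: run a block and, if it raises
# RuntimeError, wait and try again with a doubling delay.
# The sleeper argument exists only so the delay can be replaced in tests.
def with_retries(attempts = 5, base_delay = 1.0, sleeper = Kernel.method(:sleep))
  tries = 0
  begin
    yield
  rescue RuntimeError => e
    tries += 1
    raise if tries >= attempts             # give up: re-raise the last error
    delay = base_delay * (2**(tries - 1))  # base, 2*base, 4*base, ...
    warn "parse failed (#{e.message}), retrying in #{delay}s"
    sleeper.call(delay)
    retry                                  # re-run the begin block
  end
end
```

The call that reads and parses the status file could then be wrapped as `with_retries { ... }`, so that a transient bad read delays the status update rather than ending the process. A persistent failure would still be re-raised once the attempts are used up.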

What version of the product are you using? On what operating system?
trunk, r47
debian sid, icinga 1.2.1

Please provide any additional information below.

/usr/src/molniya-trunk/nagios.rb:75:in `parse_object': unexpected line:  (RuntimeError)
        from /usr/src/molniya-trunk/nagios.rb:55:in `parse_status'
        from /usr/src/molniya-trunk/nagios.rb:496:in `parse'
        from /usr/lib/ruby/1.8/pathname.rb:812:in `open'
        from /usr/lib/ruby/1.8/pathname.rb:812:in `open'
        from /usr/src/molniya-trunk/nagios.rb:496:in `parse'
        from /usr/src/molniya-trunk/nagios.rb:159:in `_refresh'
        from /usr/src/molniya-trunk/nagios.rb:127:in `refresh_if_needed'
        from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /usr/src/molniya-trunk/nagios.rb:124:in `refresh_if_needed'
        from /usr/src/molniya-trunk/nagios.rb:442:in `refresh_if_needed'
        from /usr/src/molniya-trunk/nagios.rb:164:in `contents'
        from /usr/src/molniya-trunk/molniya.rb:445:in `status_report'
        from /usr/src/molniya-trunk/molniya.rb:623:in `update_status_msg'
        from /usr/src/molniya-trunk/molniya.rb:568:in `run'
        from /usr/src/molniya-trunk/molniya.rb:801:in `launch'
        from -e:1

Original issue reported on code.google.com by dm8...@gmail.com on 30 Oct 2010 at 2:17

GoogleCodeExporter commented 9 years ago
Right now I seem to have icinga in a state which makes this problem 
reproducible. I've saved both status.dat and . Let me know if you want those as 
I'm not going to attach them to a public bug.

In addition here's the relevant part of a strace -f. Maybe that gives you an 
idea of what's happening.

[pid 10688] gettimeofday({1288680165, 578538}, NULL) = 0
[pid 10688] stat64("/var/lib/icinga/status.dat", {st_mode=S_IFREG|0664, st_size=53296, ...}) = 0
[pid 10688] stat64("/var/lib/icinga/status.dat", {st_mode=S_IFREG|0664, st_size=53296, ...}) = 0
[pid 10688] open("/var/lib/icinga/status.dat", O_RDONLY|O_LARGEFILE) = 3
[pid 10688] fstat64(3, {st_mode=S_IFREG|0664, st_size=53296, ...}) = 0
[pid 10688] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb776c000
[pid 10688] read(3, "################################"..., 4096) = 4096
[pid 10688] read(3, "apping=0\n\tpercent_state_change=0"..., 4096) = 4096
[pid 10688] read(3, "=0\n\tcheck_command=check-host-ali"..., 4096) = 4096
[pid 10688] read(3, "0\n\tretry_interval=1.000000\n\teven"..., 4096) = 4096
[pid 10688] rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0
[pid 10688] close(3)                    = 0
[pid 10688] munmap(0xb776c000, 4096)    = 0
[pid 10688] write(2, "/usr/src/molniya-trunk/nagios.rb"..., 53/usr/src/molniya-trunk/nagios.rb:75:in `parse_object') = 53
[pid 10688] write(2, ": ", 2: )           = 2
[pid 10688] write(2, "unexpected line: ", 17unexpected line: ) = 17
[pid 10688] write(2, " (", 2 ()           = 2
[pid 10688] write(2, "RuntimeError", 12RuntimeError) = 12
[pid 10688] write(2, ")\n", 2)  
)          = 2

Hope that helps.

Original comment by dm8...@gmail.com on 2 Nov 2010 at 6:55

GoogleCodeExporter commented 9 years ago
When one specific host came back online I was able to start molniya again.
After looking at both status files, to me there are two things that look like 
possible causes here:
- there is an empty line in that status block when it's down
- the plugin_output field contains an IPv6 address
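
For illustration, here is a minimal sketch of a block parser that would tolerate both suspected causes. It is not molniya's `parse_object`; `parse_block` is a hypothetical name, and the sketch assumes status blocks have the shape `type {`, then `key=value` lines, then a closing `}`. Blank lines inside a block are skipped, and each line is split on the first `=` only, so colons or further `=` signs in a value such as `plugin_output` are kept intact.

```ruby
# Hypothetical sketch, not molniya's parse_object.
# Takes an array of lines starting at a "type {" header, consumes lines up to
# and including the closing "}", and returns [type, attributes].
def parse_block(lines)
  header = lines.shift.to_s
  match = /\A\s*(\w+)\s*\{\s*\z/.match(header)
  raise "unexpected header: #{header.inspect}" unless match

  attrs = {}
  while (line = lines.shift)
    stripped = line.strip
    next if stripped.empty?              # tolerate blank lines inside a block
    break if stripped == "}"             # end of this block
    key, value = stripped.split("=", 2)  # split on the first "=" only
    raise "unexpected line: #{line.inspect}" if value.nil?
    attrs[key] = value
  end
  [match[1], attrs]
end
```

Quoting the offending line with `inspect` in the error message would also make a report like the trace above easier to diagnose, since an empty or whitespace-only line would show up explicitly instead of as nothing.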

See also attached snippets.

PS: I'm not convinced that this is also the source of the random termination 
during parsing. More likely it fails parsing something else too.

Original comment by dm8...@gmail.com on 3 Nov 2010 at 6:49

Attachments: