GoogleCodeExporter closed this issue 9 years ago
You have put several things into one issue :-)
Error level 255 is also returned when the network connection fails; it is a
classic temporary error.
Please check whether the recently released Lsyncd 2.0.5 fixes these two problems. It
differentiates between startup and normal operation: an error level 255
during startup makes Lsyncd fail, while during normal operation Lsyncd keeps running.
The rsyncBinary has also been made configurable.
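To illustrate the startup versus normal-operation distinction, here is a minimal sketch in Python. It is not Lsyncd's actual code (Lsyncd is written in C and Lua); the function name `classify_exit` and the returned labels are hypothetical, and treating every other non-zero code as a failure is an assumption made only to keep the example small.

```python
# Hypothetical sketch, not taken from Lsyncd's source.
# Assumption: exit code 255 is the only "temporary" error considered here.
TEMPORARY_EXIT_CODES = {255}


def classify_exit(exit_code: int, during_startup: bool) -> str:
    """Decide how to react to the exit code of a sync child process.

    Returns one of the illustrative labels "ok", "keep_running" or "fail".
    """
    if exit_code == 0:
        return "ok"
    if exit_code in TEMPORARY_EXIT_CODES:
        # A temporary error (e.g. a failed network connection) is fatal
        # during startup but tolerated once normal operation has begun.
        return "fail" if during_startup else "keep_running"
    # Assumption: any other non-zero code is treated as a failure.
    return "fail"
```

With this sketch, `classify_exit(255, during_startup=True)` yields `"fail"`, while the same code during normal operation yields `"keep_running"`.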
"waiting for ",np," more child processes." appears when Lsyncd is shutting
down. Do you have a log containing this message and what happened before it? Preferably
with "-log all" turned on. I suppose your inotify queue overflowed and Lsyncd
thus reset itself.
Original comment by axk...@gmail.com
on 26 Aug 2011 at 8:14
I was just trying to give you any relevant information regarding our patch to
lsyncd in case something we did might have triggered the issue
:)
I'll look into 2.0.5.
The lsyncd log fragment I pasted is a section of a 35GB file containing more of
the same. We ended up deleting the file as it filled up our /var/log, but when
we see it again I will try to capture what happened just before the spin.
We've tuned our inotify queues pretty high. We previously had issues with that
and made them quite large, since initially we pushed lots of data, which
generated tons of inotify events (we handle all of it fine now). The traffic
has actually tapered off quite a bit, so it's probably not inotify queues
overflowing any longer. It seems that lsyncd is waiting indefinitely for
processes that may have died unexpectedly or been killed off?
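For anyone wanting to verify the queue tuning mentioned here, the following is a small sketch that reads the current inotify limits on Linux. The `/proc/sys/fs/inotify` files are the standard kernel interface; the function name `read_inotify_limits` and its `proc_dir` parameter are hypothetical and exist only for this illustration.

```python
from pathlib import Path

# Standard Linux location of the inotify limits.
INOTIFY_PROC_DIR = Path("/proc/sys/fs/inotify")
INOTIFY_LIMITS = ("max_queued_events", "max_user_instances", "max_user_watches")


def read_inotify_limits(proc_dir: Path = INOTIFY_PROC_DIR) -> dict:
    """Return the current inotify limits, or None for any that cannot be read."""
    limits = {}
    for name in INOTIFY_LIMITS:
        try:
            limits[name] = int((proc_dir / name).read_text().strip())
        except (OSError, ValueError):
            # File missing (e.g. not on Linux) or content not a number.
            limits[name] = None
    return limits


if __name__ == "__main__":
    for name, value in read_inotify_limits().items():
        print(f"{name} = {value}")
```

The value of `max_queued_events` is the one that matters for queue overflows: once the queue is full, the kernel drops further events and reports an overflow instead.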
Thanks for the quick feedback!
Original comment by ari...@gmail.com
on 26 Aug 2011 at 8:28
It should not be possible for Lsyncd to wait for a process that doesn't exist.
Either it still runs, hangs, or exists as a zombie waiting to be collected
by Lsyncd. If the pid isn't there but Lsyncd still hasn't collected the
process, there must be something quite amiss.
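The three cases can be told apart with a non-blocking `waitpid`. The sketch below uses Python's `os.waitpid` on a POSIX system purely as an illustration; it is not how Lsyncd itself collects its children, and the function name `child_state` and its return labels are hypothetical.

```python
import os


def child_state(pid: int) -> str:
    """Probe a child process without blocking (POSIX only).

    Returns:
      "running"   - the child has not exited yet (this includes a hung child),
      "collected" - the child had exited (was a zombie) and this call reaped it,
      "gone"      - the pid is not an uncollected child of this process.
    """
    try:
        reaped_pid, _status = os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        # ECHILD: no such child, or it has already been collected.
        return "gone"
    # With WNOHANG, waitpid returns pid 0 while the child is still alive.
    return "running" if reaped_pid == 0 else "collected"
```

A parent that still counts a child as outstanding while this probe reports `"gone"` would be in the "quite amiss" situation described above.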
The 100% CPU thing while restarting or shutting down should be fixed with
the following patch (to lsyncd 2.0.5). Please try:
axel@prospectionist:~/lsyncd$ svn diff
Index: lsyncd.lua
===================================================================
--- lsyncd.lua (revision 587)
+++ lsyncd.lua (working copy)
@@ -2920,6 +2920,9 @@
 -- times ... the alarm time (only read if number is 1)
 --
 function runner.getAlarm()
+    if lsyncdStatus ~= "run" then
+        return false
+    end
     local alarm = false
     ----
     -- checks if current nearest alarm or a is earlier
Original comment by axk...@gmail.com
on 26 Aug 2011 at 8:41
Thanks, I will give it a whirl.
Original comment by ari...@gmail.com
on 30 Aug 2011 at 2:41
Done with 2.0.5
Original comment by axk...@gmail.com
on 16 Sep 2011 at 10:12
Original issue reported on code.google.com by
ari...@gmail.com
on 26 Aug 2011 at 7:25