circus-tent / circus

A Process & Socket Manager built with zmq
http://circus.readthedocs.org/
Other
1.55k stars 258 forks source link

'circusctl stop' followed by 'circusctl start' breaks logging for watcher #912

Closed AndrewKlopper closed 9 years ago

AndrewKlopper commented 9 years ago

I'm not sure if this issue is related to https://github.com/circus-tent/circus/issues/884 but the symptoms seem slightly different

I have a watcher configured as follows:

[watcher:consul]
cmd = /usr/bin/consul
args = agent -config-dir /etc/consul -data-dir /var/lib/consul
uid = consul
gid = consul

stdout_stream.class = FileStream
stdout_stream.max_bytes = 1048576
stdout_stream.backup_count = 7
stdout_stream.filename = /var/log/circus/consul.log

stderr_stream.class = FileStream
stderr_stream.max_bytes = 1048576
stderr_stream.backup_count = 7
stderr_stream.filename = /var/log/circus/consul_err.log

[env:consul]
GOMAXPROCS=2

Logging to /var/log/circus/consul.log works fine unless I stop and start the watcher:

circusctl stop consul
circusctl start consul

At this point, nothing is written to /var/log/circus/consul.log, and the following error is logged to the circusd.log file:

2015-07-22 17:07:37 tornado.application[8229] [ERROR] Exception in callback (4, <function null_wrapper at 0x7fce6849e0c8>)
Traceback (most recent call last):
  File "/usr/share/circus/local/lib/python2.7/site-packages/tornado/ioloop.py", line 866, in start
    handler_func(fd_obj, events)
  File "/usr/share/circus/local/lib/python2.7/site-packages/tornado/stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/share/circus/local/lib/python2.7/site-packages/circus/stream/redirector.py", line 29, in __call__
    self.redirector.redirect[self.name](datamap)
  File "/usr/share/circus/local/lib/python2.7/site-packages/circus/stream/file_stream.py", line 105, in __call__
    if self._should_rollover(data['data']):
  File "/usr/share/circus/local/lib/python2.7/site-packages/circus/stream/file_stream.py", line 143, in _should_rollover
    self._file.seek(0, 2)  # due to non-posix-compliant Windows feature
ValueError: I/O operation on closed file

The only way to fix it is to stop and start circusd.

k4nar commented 9 years ago

Are you using the last version of Circus?

AndrewKlopper commented 9 years ago

The latest version on pip (0.12.0).

k4nar commented 9 years ago

Ok. It might be linked to #884's fix (#885) then. Could you try with Circus 0.11?

AndrewKlopper commented 9 years ago

It seems to work correctly with circus 0.11.1

k4nar commented 9 years ago

Ok, so I would say that it's definitively linked to #885. There must be something wrong in Watcher.start when the Redirector is closed.

ayashjorden commented 9 years ago

Hi @k4nar, We're using circus for the last month and encountered the behaviour that @AndrewKlopper mentioned. Is there any update/timeframe for a fix?

Thanks, Yarden

k4nar commented 9 years ago

There isn't any timeframe for now, but I'll try to take a look at it today.

ayashjorden commented 9 years ago

Great! Thanks

ayashjorden commented 9 years ago

Adding my environment config: OpenVZ container running Debian7 with 8GB RAM.

Thanks, Yarden

k4nar commented 9 years ago

@ayashjorden @AndrewKlopper : Can you try out #913?

ayashjorden commented 9 years ago

Sure, I just need a little help on how to install the version from the PR.

k4nar commented 9 years ago

@ayashjorden : pip install -U git+git://github.com/circus-tent/circus.git@\#912 should do the trick.

ayashjorden commented 9 years ago

Re-installed from scratch with the command above, still getting the following error from circus log file (/var/log/circus.log):

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 866, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/circus/stream/redirector.py", line 29, in __call__
    self.redirector.redirect[self.name](datamap)
  File "/usr/local/lib/python2.7/dist-packages/circus/stream/file_stream.py", line 105, in __call__
    if self._should_rollover(data['data']):
  File "/usr/local/lib/python2.7/dist-packages/circus/stream/file_stream.py", line 143, in _should_rollover
    self._file.seek(0, 2)  # due to non-posix-compliant Windows feature
ValueError: I/O operation on closed file

The watcher log files stats:

root@bosun-sandbox:/var/log/bosun# stat bosun.err 
  File: `bosun.err'
  Size: 5302        Blocks: 16         IO Block: 4096   regular file
Device: 32h/50d Inode: 50466504    Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-08-03 13:33:45.341823537 +0000
Modify: 2015-08-03 13:33:45.341823537 +0000
Change: 2015-08-03 13:33:45.341823537 +0000
 Birth: -

As you can see the current time is ~13:40 UTC and the file mod/access is 13:33.

k4nar commented 9 years ago

@ayashjorden : According to your Traceback, you are not running with the patch (the line 105 in file_stream.py is now at line 109).

Maybe you don't have restarted Circus ?

ayashjorden commented 9 years ago

@k4nar , did you mean "#913" in the pip install command?

k4nar commented 9 years ago

Oh yes, sorry about that!

ayashjorden commented 9 years ago

It seems that the PIP command doesn't work. Also tried pip install -U git+git://github.com/circus-tent/circus.git@994986832999a7b671316830d55e84b74fcfbff8, but no luck.

Maybe I can clone the branch and create the package for pip ?

ayashjorden commented 9 years ago

Ok, got it to install: pip install -U git+git://github.com/circus-tent/circus.git@994986832999a7b671316830d55e84b74fcfbff8#egg=circus

The egg part was missing :)

Restarted circus few times => no errors restarted the watcher few times => no errors

Seems that the fix worked.

Thanks, Yarden

k4nar commented 9 years ago

Ok, thanks for trying. I'll merge the fix and make a new release.

ayashjorden commented 9 years ago

Cool! Any timeframe for that?

k4nar commented 9 years ago

I'll try to do it today.

ayashjorden commented 9 years ago

Super awesome, thanks a lot :)

k4nar commented 9 years ago

@ayashjorden : Circus 0.12.1 is on PyPi :) .