kostya / eye

Process monitoring tool. Inspired by Bluepill and God.
MIT License

command :info, objects not found! #100

Closed. dmerrick closed this issue 9 years ago

dmerrick commented 9 years ago

First, I want to thank you for your prompt help on #99. We've upgraded to 0.7.pre and as far as I can tell it's no longer timing out.

Unfortunately, we occasionally get the output "command :info, objects not found!" from the info command.

Usually this is in the context of an init script, so the command is eye info resque_workers. It would be really nice to track down why this is happening, so we can start to trust the reliability of eye info.

kostya commented 9 years ago

That is strange. If you get "command :info, objects not found!", it means eye has no objects named 'resque_workers'. Next time, try comparing the output of eye info without a filter and with a filter, to find where the problem is.
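A minimal sketch of that comparison, assuming eye is on the PATH and that the process name shows up literally in the unfiltered output. The function name compare_eye_info is hypothetical and not part of eye; only eye info with and without a name argument comes from this thread.

```shell
# Hypothetical helper (not part of eye): print `eye info` with and
# without a name filter, then report whether the name shows up at all.
compare_eye_info() {
  name="$1"
  all_output="$(eye info 2>&1)"
  filtered_output="$(eye info "$name" 2>&1)"

  echo "--- eye info (no filter) ---"
  echo "$all_output"
  echo "--- eye info $name ---"
  echo "$filtered_output"

  # Assumption: a loaded object is listed by name in the unfiltered output.
  # If it is missing there too, eye has nothing loaded under that name.
  if printf '%s\n' "$all_output" | grep -q -F -- "$name"; then
    echo "result: '$name' appears in the unfiltered output"
  else
    echo "result: '$name' does not appear in the unfiltered output"
    return 1
  fi
}
```

Calling compare_eye_info resque_workers right after a failure would show whether only the filtered lookup fails or whether eye has nothing loaded under that name at all.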

dmerrick commented 9 years ago

Let me get back to you on this...

dmerrick commented 9 years ago

I think the issue here is related to eye load. We have an init script that runs eye commands, but it will occasionally fail with errors like this. If I run eye load before I run those commands, it completes without error.

kostya commented 9 years ago

If you mean a race: there are no races between load and info. load is an atomic command that blocks all other commands. So I don't understand the issue. Somehow you have no resque_workers in eye; maybe you forgot to load the config the first time, or something else happened. If you look with info, you should see that there is no resque_workers. What is the issue here, and how can it be reproduced?

dmerrick commented 9 years ago

Let me see if I can shed more light on this. We manage eye apps using the chef eye cookbook. It creates an init script for us that you can see here.

case "$1" in
  start)
    echo -n "Loading eye configuration for $SERVICE_NAME"
    execute load $CONFIG_FILE
    execute start $SERVICE_NAME
    ;;
  stop)
    execute stop $SERVICE_NAME
    ;;
  restart)
    execute restart $SERVICE_NAME
    ;;
  status)
    execute info $SERVICE_NAME
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status}"
    exit 1
    ;;
esac

Since our workers are started via this script, theoretically the config should be loaded. But sometimes it seems that the config has not been loaded, so we get the error in the title of this issue. If I add execute load $CONFIG_FILE to the beginning of stop, restart, and status, this error goes away.
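The workaround described above could be sketched roughly as follows. This is not the cookbook's actual script: run_action is a hypothetical wrapper, execute is a stand-in for the script's own helper (assumed here to simply call eye), and the default values for SERVICE_NAME and CONFIG_FILE are placeholders.

```shell
# Placeholder defaults; the real init script sets these itself.
SERVICE_NAME="${SERVICE_NAME:-resque_workers}"
CONFIG_FILE="${CONFIG_FILE:-/etc/eye/resque_workers.eye}"  # hypothetical path

# Stand-in for the init script's own `execute` helper (assumption:
# it forwards its arguments to the eye command line).
execute() { eye "$@"; }

# Hypothetical wrapper: map the init action to an eye command and
# always load the config first, as in the workaround described above.
run_action() {
  case "$1" in
    start)   eye_cmd=start ;;
    stop)    eye_cmd=stop ;;
    restart) eye_cmd=restart ;;
    status)  eye_cmd=info ;;
    *)
      echo "Usage: run_action {start|stop|restart|status}" >&2
      return 1
      ;;
  esac

  # Load before every action, not only before start.
  execute load "$CONFIG_FILE" || return 1
  execute "$eye_cmd" "$SERVICE_NAME"
}
```

Loading before every action only hides the symptom. It does not explain why the config was missing in the first place.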

kostya commented 9 years ago

So the problem is that the config is not loaded? What does eye info output in that case? What does the eye log say? Was the config ever loaded or not? Maybe this is a problem with the chef eye cookbook, or something else; you can dig into it through the eye log.

dmerrick commented 9 years ago

Is it possible that the config might not be loaded in certain circumstances? We run eye:

  1. over SSH (via capistrano)
  2. from chef
  3. from shell scripts (like above)
  4. from the command line

Sometimes these commands fail because eye does not seem to be aware of our eye config. (We only have a single eye config file on these servers.)

Running eye i when we get these errors returns an empty response, if I recall correctly.

kostya commented 9 years ago

I'm not sure the config can fail to load; I have never seen this (we use cap and shell). If eye i is empty, it means that the config was not loaded at all, or maybe eye died somehow and then came back up without a config, or someone deleted all processes from eye. You can find out from the eye log. What does eye x -c output: is the config there or not? Please also show the output of eye x.
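One way to capture the requested output at the moment of failure is sketched below, assuming eye is on the PATH. The function name collect_eye_state is hypothetical; the only eye invocations used are the ones mentioned in this thread (eye info, eye x, eye x -c).

```shell
# Hypothetical helper (not part of eye): dump the state requested above
# so it can be attached to the issue when the error shows up again.
collect_eye_state() {
  echo "=== collected at $(date -u '+%Y-%m-%dT%H:%M:%SZ') ==="
  for args in "info" "x" "x -c"; do
    echo "--- eye $args ---"
    # $args is intentionally unquoted so that "x -c" splits into two arguments.
    eye $args 2>&1 || echo "(eye $args exited with status $?)"
  done
}
```

Redirecting the result to a file, for example collect_eye_state > eye-state.txt, right after the init script reports "objects not found!" would show whether the config was present at that point.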

kostya commented 9 years ago

Any progress here?