TritonDataCenter / containerpilot

A service for autodiscovery and configuration of applications running in containers
Mozilla Public License 2.0

No output from backend onChange script #210

Closed konobi closed 8 years ago

konobi commented 8 years ago

I've got a project here where I'm having trouble with the onChange script, but containerpilot doesn't seem to pipe the output from this script to its own stdout/stderr. I have no indication as to whether the script was run, was attempted, or failed for some reason, so I'm kinda stuck =0(

It could be related to the last release's changes, but I see the same behavior with older versions of containerpilot. This report is for 2.4.1.

fitz123 commented 8 years ago

Hello konobi! Do you see any output if your onChange script is echo "I'm the stdout Output!"? You could also make the onChange script something like touch createdByOnChangeScript to see whether it runs at all.

misterbisson commented 8 years ago

You might try increasing the log level in the logging config, https://www.joyent.com/containerpilot/docs/configuration#-code-logging--code-

konobi commented 8 years ago

nope, and I have set -x set within the script, so I should be seeing lots of output. The logging is already set at INFO.

misterbisson commented 8 years ago

The logging is already set at INFO

Have you tried DEBUG?

konobi commented 8 years ago

ah... not even echo '==============I AM HERE===============' defined directly in the .json is being shown... lemme try DEBUG. K, there's some information being displayed now, but I don't see any output from the process, just information about the go functions being used; I'm not seeing the above output in the docker logs =0( I can't even tell if it's attempting to run on reload... =0/

fitz123 commented 8 years ago

I'd guess that your onChange script is not being triggered, probably because either containerpilot has a problem communicating with consul or you're using the wrong name in the backends section.

With the log level at INFO, echoes should be visible.

Here's my example; try simplifying yours to match:

  1. Run Container
{
  "consul": "{{ .CONSUL }}",
  "logging": {"level": "INFO"},
  "services": [
    {
      "name": "origin-proxy",
      "port": 8888,
      "poll": 2,
      "interfaces": ["10.0.0.0/8"],
      "ttl": 5
    }
  ],
  "backends": [
    {
      "name": "origin-proxy",
      "poll": 1,
      "onChange": "echo OnCHange",
      "timeout": "1s"
    }
  ]
}

Immediately after run log:

2016/08/17 19:53:36 OnCHange

Then I manually deregistered the service: curl -XPUT 'localhost:8500/v1/agent/service/deregister/origin-proxy-admin'

Then logs output:

2016/08/17 19:53:57 Unexpected response code: 500 (CheckID does not have associated TTL)
Service not registered, registering...
2016/08/17 19:54:00 OnCHange

tgross commented 8 years ago

If you can share the debug logs and containerpilot.json config file (redacted if necessary) in a gist, that would help us debug this.

konobi commented 8 years ago

ah... so does the onChange handler not get called on initial startup or containerpilot reload time?

fitz123 commented 8 years ago

As far as I know, it's supposed to be called on a change of the monitored consul service. For startup routines you can use the preStart option. I don't know about reload though; I've never used it. You can refer to the configuration docs: https://www.joyent.com/containerpilot/docs/configuration

konobi commented 8 years ago

I'm already using the prestart to do a whole bunch of things, it's just that the onChange script that I have is designed purely for making sure the backend service is up and marking something as "primary". Since that needs to occur on any change to that backend service, that's all it does.

However, since it is never run for any of the instances on start up, none of them will be attempting to claim themselves as primary. When the containerpilot process comes up (or reloads), I was assuming that it was going to fire these scripts since "starting" and "reloading" are states in and of themselves.

I could call the onChange script from prestart I suppose, but it seems kinda odd.

tgross commented 8 years ago

Yeah the onChange only gets fired when a monitored backend changes. Once the preStart exits successfully, the various polling loops start (async) and then the main application is started.

When the containerpilot process comes up (or reloads), I was assuming that it was going to fire these scripts since "starting" and "reloading" are states in and of themselves.

The onChange hooks are associated with the backends, not with the main application. You could have the service watch itself as a backend, but that seems overly complicated for what you're trying to do here.

konobi commented 8 years ago

ah... i had expected that a "starting" state would have been thought of as an onChange event since it's up to the image to decide which one is primary.

I'm having a little more luck now with trying to run the onChange script at the end of the prestart script, but it gets into nasty issues of daemonizing properly and tty issues, etc. since the onChange script just wants to be spun off initially. I have to push all the output out to /dev/null because otherwise the prestart script doesn't exit until its file descriptors are cleaned up.

I suppose you could call it a preexec hook, since it's only interested in being called when the main process is being run, but alongside it. Any suggestions on how to achieve the same sort of thing?

fitz123 commented 8 years ago

@konobi, if I understand your use case correctly:

  1. Use consul's tags to mark service as primary/secondary. You have to set "enableTagOverride": true in consul to make it work.
  2. Mark service as a Slave by default
  3. Use both prestart and onchange hooks to check status of the service
  4. Become primary if no healthy nodes are detected

misterbisson commented 8 years ago

Use consul's tags to mark service as primary/secondary

I'd suggest using locks, rather than tags. You can see that in use in https://github.com/autopilotpattern/mysql in bin/manage.py. Locking semantics are well suited to situations where there can be only one primary.
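To make the locking idea concrete, here is a minimal sketch of primary election using Consul's session and KV HTTP endpoints. It is not the code from bin/manage.py; the function names, the key, the node name, and the 30s TTL are all hypothetical choices for illustration.

```python
import json
import urllib.request


def consul_put(url, body=b""):
    """PUT to Consul's HTTP API and return the decoded JSON response."""
    req = urllib.request.Request(url, data=body, method="PUT")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def try_become_primary(consul, key, node_name, put=consul_put):
    """Return the lock-holding session ID if this node won, else None.

    `key`, `node_name`, and the 30s TTL are hypothetical choices.
    """
    # A session with a TTL lets the lock lapse if this node stops renewing it.
    payload = json.dumps({"Name": node_name, "TTL": "30s", "Behavior": "release"})
    session_id = put(f"{consul}/v1/session/create", payload.encode("utf-8"))["ID"]

    # The acquire call returns true only if this session now holds the lock.
    won = put(f"{consul}/v1/kv/{key}?acquire={session_id}", node_name.encode("utf-8"))
    if won:
        return session_id

    # Lost the race: clean up the unused session and stay a replica.
    put(f"{consul}/v1/session/destroy/{session_id}")
    return None
```

Something like try_become_primary("http://localhost:8500", "service/mysql/primary", "node1") could be called from both a preStart and an onChange handler, since losing the race is harmless. The `put` parameter exists only so the HTTP call can be swapped out in tests.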

misterbisson commented 8 years ago

does the onChange handler not get called on initial startup or containerpilot reload time?

since it is never run for any of the instances on start up, none of them will [...] When the containerpilot process comes up (or reloads), I was assuming that it was going to fire [the onChange] scripts [...]

@fitz123 and @tgross correctly clarified that the backends onChange handler only applies to the monitored backend specified in the name field. See https://www.joyent.com/containerpilot/docs/configuration#-code-backends--code- for docs on that.

The underlying question here is how to execute code before the main application starts, or before it's registered in the service catalog. The canonical example for simple cases can be found in autopilotpattern/nginx's containerpilot.json:

{
  "consul": "{{ if .CONSUL_AGENT }}localhost{{ else }}{{ .CONSUL }}{{ end }}:8500",
  "preStart": "/usr/local/bin/reload.sh preStart",
  "logging": {"level": "DEBUG"},
  "services": [
    {
      "name": "nginx",
      "port": 80,
      "health": "/usr/bin/curl --fail --silent --show-error --output /dev/null http://localhost/nginx-health",
      "poll": 10,
      "ttl": 25
    }
  ],
  "backends": [
    {
      "name": "{{ .BACKEND }}",
      "poll": 7,
      "onChange": "/usr/local/bin/reload.sh"
    }
  ],
[...]

In that example, Nginx needs to have its configuration template written out with any backends before it starts. The solution there is to use a preStart handler that is almost identical to the onChange handler, except that it doesn't attempt to reload Nginx.

MySQL is a more complex application with a suitably more sophisticated set of handlers. Nonetheless, it's still using the normal lifecycle events in ContainerPilot. The preStart runs before MySQL starts, but we're also leveraging services health and watching for changes to the mysql-primary in backends onChange. As noted in my comment above, bin/manage.py in https://github.com/autopilotpattern/mysql may offer a number of solutions that you can borrow from.

I'm going to label this as usage, and I've added a note in https://github.com/joyent/containerpilot/issues/197#issuecomment-240557097 to improve the docs about this.

konobi commented 8 years ago

Okay, I ended up with something working. I ended up making a single non-restarting co-process that calls my onChange script. So the script is kicked off at the right time, ensures that all instances can attempt to become primary all while stateless. The only stateful part is on the primary where it uses a consul session lock which is renewed by the health script.
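For anyone wanting to try the same approach, the renewal step in the health script could look roughly like the sketch below. This is an illustration, not the actual script; the function name and the `opener` parameter are hypothetical, and it assumes the session ID from the lock acquisition is available to the health check.

```python
import json
import urllib.error
import urllib.request


def renew_primary_session(consul, session_id, opener=urllib.request.urlopen):
    """Renew the session behind the primary lock; return True if it is still alive."""
    req = urllib.request.Request(
        f"{consul}/v1/session/renew/{session_id}", data=b"", method="PUT"
    )
    try:
        with opener(req, timeout=5) as resp:
            # On success Consul returns a JSON list describing the renewed session.
            return bool(json.loads(resp.read().decode("utf-8")))
    except urllib.error.URLError:
        # Covers connection failures and HTTP errors such as an expired session.
        return False
```

A health script on the primary could exit non-zero when this returns False, so the node stops reporting as healthy once it no longer holds the lock.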

It'd be handy to maybe see a flag on the backends so that on start/reload the onChange script is run (90% of the time it'll just do nothing, but gives it the option of doing something, like becoming the primary).

This is for a mysql cluster, so it's pretty similar to the mysql image you already have, but I wanted to start off with the foundations of the image being about cluster management and stability (specific to our env). We're also not a manta user, so all the additions around backup/replica/standby, etc. weren't of use to us and the replication is based on galera, so we had to use a different approach. I did refer to it a lot during development though.

tgross commented 8 years ago

Looks like this is resolved then.