plathrop / puppet-module-supervisor

Puppet module for configuring the supervisor daemon tool.
BSD 3-Clause "New" or "Revised" License
6 stars 0 forks source link

Implement starting/stopping/getting status of supervisor programs as a puppet provider #53

Closed fungusakafungus closed 10 years ago

fungusakafungus commented 11 years ago

...so we can have tests, add edge cases, notice regressions and have the stuff documented.

This addresses #52 (in some sence, making STARTING a valid process status after restart/update) and #51.

I spent some time trying to stitch the new tests to the old shell/awk implementation, so I can guarantee that the new implementation does the same, but failed...

So there are tests, and we think we fixed some bugs and haven't broken too much.

One (maybe) important change is that the services for individual programs are now having supervisor:: as a prefix. This was done to avoid name conflicts. See line https://github.com/Jimdo/puppet-module-supervisor/compare/puppet_provider#L2L91 Huch, this is wrong way around. The service { "supervisor::${name}": is what this line should look like. The pull request breaks it, not fixes. Disregard this. Fixed.

You can see the docs when you run rspec --format documentation.

Caveat: the tests do not run real supervisorctl. We just copied the output of supervisorctl into tests.

The initial provider implementation was proudly stolen from https://gist.github.com/monkeymantra/4273546, KUDOS to @monkeymantra

Paired with @skrings all the way.

What do you guys and girls think?

andyshinn commented 11 years ago

Very cool! I'm trying this out as it trumps #52 and likely fixes #51 also.

andyshinn commented 11 years ago

Looks good. Solves both #51 and #52. The tests are passing locally for me too:

⮀ rake spec
/Users/ashinn/.rvm/rubies/ruby-1.9.3-p448/bin/ruby -S rspec spec/classes/supervisor::params_spec.rb spec/classes/supervisor_spec.rb spec/defines/supervisor::service_spec.rb spec/functions/supervisor_spec.rb --color
............................................

Finished in 1.02 seconds
44 examples, 0 failures
skrings commented 11 years ago

@plathrop do you want to have a look, i'm looking forward to :ship: it.

andyshinn commented 11 years ago

I'd like to vote for changing the Puppet.warning to a Puppet.notice. In my scenario the processes don't startup immediately. Also, even if the processes do startup immediately, the status will still be STARTING until the startsecs times out.

I continuously use this module in a Vagrant environment and I hate seeing Warnings because they show up in red, which makes it look like a failure to me. Another developer also thought it was failing because of the red warnings. Just a thought!

andyshinn commented 11 years ago

I've also noticed that when you set a service to 'absent', it removes it, but subsequent runs fail with something like:

Error: Could not stop create_thumbnails: create_thumbnails: ERROR (no such process)

Error: /Stage[main]//Node[workerd.eye.fi]/Supervisor::Service[create_thumbnails]/Service[create_thumbnails]/ensure: change from running to stopped failed: Could not stop create_thumbnails: create_thumbnails: ERROR (no such process)
fungusakafungus commented 11 years ago

@andyshinn Regarding ensure => absent:

Good find. The .ini file gets removed before the service is stopped, than the provider can't find the service to stop it.

As I understand it, following is happening: (assuming that there is supervisor::service {'create_thumbnails': ensure => absent} somewhere in the manifest)

  1. file {'/etc/*create_thumbnails*':ensure=>absent} notifies Class['supervisor::update']
  2. supervisor rereads its configs and loses mention of create_thumbnails
  3. service {"supervisor::create_thumbnails":ensure=>stopped} requires both the file and the Class['supervisor::update'], so the file is absent, the main process does not know anything about create_thumbnails.
  4. provider.stop is called. I would think provider.status is called first, but this does not seem to be the case.

We could solve it in following ways:

  1. Handle ERROR (no such process) more graceful in provider.stop.
  2. Revert the dependencies (first stop the service, than delete the config) if we handle ensure => absent

TL;DR: Yes, there is an error when ensure => absent is given.

andyshinn commented 11 years ago

I like solution 2 personally.

On a side note (because I noticed there was some program / process name commits), are there plans or interest in renaming the program and process names? With longer titles I end up with supervisor names like image_processing_create_thumbnails:image_processing_create_thumbnails_00. I expected them to be something like image_processing_create_thumbnails when only one and image_processing_create_thumbnails:0001 when multiple (or a larger sensible number to accommodate very large worker counts).

andyshinn commented 11 years ago

@fungusakafungus any more thoughts on the error and merging this into master? Despite the ensure => 'absent' error, it works well for me in production (I just remove the absentee resources once after a couple Puppet runs).

If you can push a reversal of the dependencies to fix the error I'd be happy to try it out. Anxious to rebase my current development off plathrop/master - I'd like to submit another PR to allow disabling service management altogether (in the cases where a code management deployment tool like Capistrano is used and the command files don't actually exist until Capistrano pushes them).

fungusakafungus commented 11 years ago

@andyshinn I have been working on integration testing with something like serverspec and vagrant to reproduce this error in a test, but I don't have anything useful to date. I will push dependency reversal without formal tests later today or over the weekend and would be happy if you could test it.

plathrop commented 10 years ago

Dependency reversal is definitely the way to go. Overall this looks awesome. Again, I don't use this module in prod anymore, so I don't want to be a gatekeeper. If you all like it, let's merge it ASAP :-)

skrings commented 10 years ago

@andyshinn Would be cool if you can verify that the dependency reversal fixes the ensure => 'absent' error.

andyshinn commented 10 years ago

Ah, I missed the pushed commits. Thanks for notifying me. I'll be able to play around with this in a day or two to confirm.

fungusakafungus commented 10 years ago

@andyshinn, yeah, it would be cool if you'd test this, because I couldn't reproduce the problem myself.

It would be also useful if we could have some kind of puppet apply or similar manual regression test here. For example as done here: #62

andyshinn commented 10 years ago

Cool, I'll use my Vagrant instance as a test and try applying directly on the instance using puppet apply. I've been playing a bit with https://github.com/puppetlabs/rspec-system-puppet. Maybe we would be open to a branch to implement this so we could easily bring up a Vagrant instance to test these kind of regressions?

fungusakafungus commented 10 years ago

rspec-system-puppet or serverspec would be awesome! :rainbow: But one thing after another, a manual test would be enough to finally merge this.

andyshinn commented 10 years ago

So is e7ec321 just merging plathrop/master into this branch? Any of these commits relying on latest from master? That is an undesirable commit for my testing so I'm going to try to just cherry-pick the 4 added commits.

fungusakafungus commented 10 years ago

@andyshinn shrug I was just trying to keep the branch fresh, I haven't quite understood the enable stuff. It would be enough if you would just cherry-pick stuff.

andyshinn commented 10 years ago

Sorry I haven't had time to test this yet (I initially tried to merge in to my working branch, but e7ec321 is conflicting). I should be able to test it out next week.

fungusakafungus commented 10 years ago

I've cut the branch before the merge and pushed it here: https://github.com/Jimdo/puppet-module-supervisor/tree/puppet_provider_without_master_merge Thanks for testing!

andyshinn commented 10 years ago

That makes it much easier! I was able to merge and test this out fairly easily. I provisioned a test worker machine:

 ~/Projects/server ⮀ ⭠ upgrade_supervisor_jimdo  ⮀ vagrant provision worker
[worker] Running provisioner: puppet...
Running Puppet with worker.pp...
Notice: /Stage[main]/Supervisor/File[/etc/supervisord.conf]/content: content changed '{md5}a957251f69236ac9c8a86906ad33c826' to '{md5}ee99725484ad9f1f41e0c0bea0f8d0e2'
Notice: /Stage[main]/Supervisor/Service[supervisord]: Triggered 'refresh' from 1 events
Notice: /Stage[main]//Supervisor::Service[foreign_importer_absorb]/File[/etc/supervisord.d/foreign_importer_absorb.ini]/content: content changed '{md5}1aa7d16382670e3b6371276aeaa17677' to '{md5}0f913ca0eabc880a6ad04c906c4d5c63'
Notice: /Stage[main]//Supervisor::Service[create_thumbnails]/File[/etc/supervisord.d/create_thumbnails.ini]/content: content changed '{md5}984ffc31f1ebbdb5b5ee19c5aabe3dbb' to '{md5}a135170dcec5c10ba238c0d53e7458f1'
Notice: /Stage[main]//Supervisor::Service[foreign_importer_download]/File[/etc/supervisord.d/foreign_importer_download.ini]/content: content changed '{md5}872e4a8604d5c1054328f9f1a8059646' to '{md5}0541b2000b20073ad6abeaa7bf796e53'
Notice: /Stage[main]//Supervisor::Service[pubsub]/File[/etc/supervisord.d/pubsub.ini]/content: content changed '{md5}50657022bff85ee8175965f7ffe6e93d' to '{md5}928e4b6cacd3967fec268680135f2432'
Notice: /Stage[main]//Supervisor::Service[foreign_importer_eyefi]/File[/etc/supervisord.d/foreign_importer_eyefi.ini]/content: content changed '{md5}efbd958214fca29ca88527ac49c088ba' to '{md5}667314a59ac865e4fce6562e97058483'
Notice: /Stage[main]//Supervisor::Service[marketo_mailer]/File[/etc/supervisord.d/marketo_mailer.ini]/content: content changed '{md5}81807aab1b8285d5b37b0ce42d79ba6f' to '{md5}49de58451baa011fe19b8469eaefef29'
Notice: /Stage[main]//Supervisor::Service[foreign_importer_phanfare]/File[/etc/supervisord.d/foreign_importer_phanfare.ini]/content: content changed '{md5}3b220f421ca9d1b04d2c0385767ec3fe' to '{md5}9cccf1e4d27d7bcd276bb80931a9891f'
Notice: /Stage[main]//Supervisor::Service[foreign_importer_facebook]/File[/etc/supervisord.d/foreign_importer_facebook.ini]/content: content changed '{md5}ccb6006b737159740b46e23f95e198fc' to '{md5}6ab19a3f81556ca2e8a03288b2733054'
Notice: /Stage[main]/Supervisor::Update/Exec[supervisor::update]: Triggered 'refresh' from 1 events
Warning: Could not reliably determine status: foreign_importer_absorb is still starting
Warning: Could not reliably determine status: pubsub is still starting
Notice: Finished catalog run in 14.41 seconds

Provision again to make sure no funny business:

 ~/Projects/server ⮀ ⭠ upgrade_supervisor_jimdo  ⮀ vagrant provision worker
[worker] Configuring cache buckets...
[worker] Running provisioner: hostmanager...
[worker] Configuring cache buckets...
[worker] Running provisioner: puppet...
Running Puppet with worker.pp...
Notice: Finished catalog run in 3.95 seconds

Looks good, everything already running. Now set my pubsub service to absent:

 ~/Projects/server ⮀ ⭠ upgrade_supervisor_jimdo  ⮀ vagrant provision worker
[worker] Running provisioner: puppet...
Running Puppet with worker.pp...
Notice: /Stage[main]//Supervisor::Service[pubsub]/Service[supervisor::pubsub]/ensure: ensure changed 'running' to 'stopped'
Notice: /Stage[main]//Supervisor::Service[pubsub]/File[/etc/supervisord.d/pubsub.ini]/ensure: removed
Notice: /Stage[main]/Supervisor::Update/Exec[supervisor::update]: Triggered 'refresh' from 1 events
Notice: /Stage[main]//Supervisor::Service[pubsub]/File[/var/log/supervisor/pubsub]/ensure: removed
Notice: Finished catalog run in 6.28 seconds

Looks good, subsequent run while still absent:

 ~/Projects/server ⮀ ⭠ upgrade_supervisor_jimdo ● ⮀ vagrant provision worker
[worker] Running provisioner: puppet...
Running Puppet with worker.pp...
Notice: Finished catalog run in 3.77 seconds

Working as desired in my Vagrant environment. :shipit:

skrings commented 10 years ago

Cool stuff

fungusakafungus commented 10 years ago

:rainbow: :+1: :rainbow:

skrings commented 10 years ago

Props to @andyshinn for all the feedback, you have been very helpful.

andyshinn commented 10 years ago

image

plathrop commented 10 years ago

You guys rock. So awesome to have a community built up around this module.