rcbops / chef-cookbooks

RCB OPS - Chef Cookbooks

Graphite causing apache to restart unnecessarily #885

Open cfarquhar opened 10 years ago

cfarquhar commented 10 years ago

Apache is restarting during every chef-client run. It appears to be caused by an interaction between horizon::default and graphite::graphite that causes /etc/apache2/ports.conf to be updated each time. This is happening on the v4.1.2 cookbooks.

I think these are the relevant chef-client.log messages:

[2014-03-16T06:43:18+00:00] INFO: Processing template[/etc/apache2/ports.conf] action create (horizon::default line 195)
[2014-03-16T06:43:18+00:00] INFO: template[/etc/apache2/ports.conf] backed up to /var/chef/backup/etc/apache2/ports.conf.chef-20140316064318
[2014-03-16T06:43:18+00:00] INFO: template[/etc/apache2/ports.conf] removed backup at /var/chef/backup/./etc/apache2/ports.conf.chef-20140316050923
[2014-03-16T06:43:18+00:00] INFO: template[/etc/apache2/ports.conf] updated file contents /etc/apache2/ports.conf
[2014-03-16T06:43:18+00:00] INFO: template[/etc/apache2/ports.conf] not queuing delayed action restart on service[apache2] (delayed), as it's already been queued
[2014-03-16T06:43:18+00:00] INFO: template[/etc/apache2/ports.conf] not queuing delayed action restart on service[apache2] (delayed), as it's already been queued

--- 8< ---

[2014-03-16T06:43:20+00:00] INFO: Processing template[/etc/apache2/ports.conf] action create (graphite::graphite line 126)
[2014-03-16T06:43:20+00:00] INFO: template[/etc/apache2/ports.conf] backed up to /var/chef/backup/etc/apache2/ports.conf.chef-20140316064320
[2014-03-16T06:43:20+00:00] INFO: template[/etc/apache2/ports.conf] removed backup at /var/chef/backup/./etc/apache2/ports.conf.chef-20140316054039
[2014-03-16T06:43:20+00:00] INFO: template[/etc/apache2/ports.conf] updated file contents /etc/apache2/ports.conf
[2014-03-16T06:43:20+00:00] INFO: template[/etc/apache2/ports.conf] not queuing delayed action restart on service[apache2] (delayed), as it's already been queued
[2014-03-16T06:43:20+00:00] INFO: template[/etc/apache2/ports.conf] not queuing delayed action restart on service[apache2] (delayed), as it's already been queued
[2014-03-16T06:43:20+00:00] INFO: template[/etc/apache2/ports.conf] not queuing delayed action restart on service[apache2] (delayed), as it's already been queued

--- 8< ---

[2014-03-16T06:43:24+00:00] INFO: template[/etc/apache2/ports.conf] sending restart action to service[apache2] (delayed)
[2014-03-16T06:43:24+00:00] INFO: Processing service[apache2] action restart (apache2::default line 221)

breu commented 10 years ago

Can you verify that this is still happening after upgrading to the latest 4.1.x branch?

claco commented 10 years ago

@rackerjoe @cfarquhar Most likely, this needs to be sorted so it is rendered the same each time: https://github.com/rcbops-cookbooks/horizon/blob/v4.1.5rc/recipes/server.rb#L68

@cfarquhar Can you run a diff between the file created /etc/apache2/ports.conf and the file backed up by chef /var/chef/backup/etc/apache2/ports.conf.chef-20140316050923 (or any of them) and post the output?

I'm assuming that the ports in the file just change order, causing a file difference, and then a restart.

claco commented 10 years ago

In fact, we're laying down ports.conf twice: once in horizon as a resource rewind of apache2's version, and once in graphite as a clone.

https://github.com/rcbops-cookbooks/graphite/blob/master/recipes/graphite.rb#L101

cfarquhar commented 10 years ago

This should help. With each run we alternate between having and not having an IP for the Listen and NameVirtualHost directives.

root@xxxxxx-controller01:~# md5sum /var/chef/backup/etc/apache2/ports.conf.chef-201403161*
1fb85d901766a1381f215bee4246011e  /var/chef/backup/etc/apache2/ports.conf.chef-20140316122801
9a60a3430b62e8838220d4dcf8d8ebae  /var/chef/backup/etc/apache2/ports.conf.chef-20140316125919
1fb85d901766a1381f215bee4246011e  /var/chef/backup/etc/apache2/ports.conf.chef-20140316125921
9a60a3430b62e8838220d4dcf8d8ebae  /var/chef/backup/etc/apache2/ports.conf.chef-20140316133039
1fb85d901766a1381f215bee4246011e  /var/chef/backup/etc/apache2/ports.conf.chef-20140316133041

root@xxxxxx-controller01:~# diff -u /etc/apache2/ports.conf /var/chef/backup/etc/apache2/ports.conf.chef-20140316133041
--- /etc/apache2/ports.conf     2014-03-16 13:30:41.702108427 +0000
+++ /var/chef/backup/etc/apache2/ports.conf.chef-20140316133041 2014-03-16 13:30:39.066029543 +0000
@@ -1,10 +1,10 @@
 #This file generated via template by Chef.
-Listen 80
-NameVirtualHost *:80
+Listen 0.0.0.0:80
+NameVirtualHost 0.0.0.0:80

-Listen 443
-NameVirtualHost *:443
+Listen 0.0.0.0:443
+NameVirtualHost 0.0.0.0:443

-Listen 8080
-NameVirtualHost *:8080
+Listen 0.0.0.0:8080
+NameVirtualHost 0.0.0.0:8080

root@xxxxxx-controller01:~# diff -u /etc/apache2/ports.conf /var/chef/backup/etc/apache2/ports.conf.chef-20140316133039
root@xxxxxx-controller01:~#

claco commented 10 years ago

@cfarquhar Thank you sir. That will definitely help in tracking this down. Out of curiosity, do both versions work (with and without 0.0.0.0:)?

mancdaz commented 10 years ago

The problem with cloned resources and chef-rewind is that chef-rewind will only find and edit the first instance of the resource that it finds in the resource collection.

chef-rewind in the referenced issue is doing its thang, but then the instance of the resource that's being created in the graphite cookbook is throwing things off again.
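
To make the "first instance only" behaviour concrete, here is a minimal sketch in plain Ruby. It is a toy model, not chef-rewind's or Chef's actual code: `ToyResource`, `find_first`, and the template name `ports_with_ip.conf.erb` are hypothetical names made up for illustration. It assumes only what is described above, namely that two resources share the same type and name and that the lookup stops at the first match.

```ruby
# Toy model only: this is NOT chef-rewind's or Chef's real implementation.
# ToyResource, find_first and the template names are hypothetical.
ToyResource = Struct.new(:type, :name, :source, :declared_in)

# Two resources with the same type and name, in the order seen in the log.
collection = [
  ToyResource.new(:template, "/etc/apache2/ports.conf", "ports.conf.erb", "horizon::default"),
  ToyResource.new(:template, "/etc/apache2/ports.conf", "ports.conf.erb", "graphite::graphite")
]

# Look a resource up by type and name, stopping at the first match.
def find_first(collection, type, name)
  collection.find { |r| r.type == type && r.name == name }
end

# "Rewind" the resource so it uses a different template.
target = find_first(collection, :template, "/etc/apache2/ports.conf")
target.source = "ports_with_ip.conf.erb"

# Only the first entry was changed; the second still has the old template.
collection.each do |r|
  puts "#{r.declared_in}: #{r.source}"
end
```

Under these assumptions the second entry keeps its original template, which is the situation described for the resource created in the graphite cookbook.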

claco commented 10 years ago

This is why chef-edit exists. But cloning is usually the least of our problems. In the spc book, we do resources :type => "name" directly and have no clone warnings.

mancdaz commented 10 years ago

@claco I disconcur. Cloning is the problem here. We edit one instance of the template resource to use the correct template (with listen IP:port), but the other resource does not get edited, so it lays down the wrong template.
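
A rough sketch of why this restarts apache on every run, again in plain Ruby and again only a simplification (the `converge` helper is hypothetical and is not how chef-client is implemented). It assumes two resources that want different contents for the same file and run in a fixed order, as in the log and diff above.

```ruby
# Toy converge loop; a simplification, not chef-client's real logic.
# Each "resource" is reduced to the content it wants in ports.conf.
WITH_IP    = "Listen 0.0.0.0:80\nNameVirtualHost 0.0.0.0:80\n"
WITHOUT_IP = "Listen 80\nNameVirtualHost *:80\n"

# Order matches the log: the edited resource first, the unedited one second.
resources = [WITH_IP, WITHOUT_IP]

# Apply the resources in order to the file content and
# return [final_content, number_of_updates].
def converge(file_content, resources)
  updates = 0
  resources.each do |desired|
    next if file_content == desired # already up to date, nothing to do
    file_content = desired          # rewrite the file (a backup would be taken here)
    updates += 1                    # an update is what queues the apache2 restart
  end
  [file_content, updates]
end

# Simulate three consecutive runs, carrying the file content between them.
content = ""
update_counts = 3.times.map do
  content, updates = converge(content, resources)
  updates
end

puts update_counts.inspect
```

In this model every run rewrites the file twice, because each resource undoes the other's content. That lines up with the two backups per run and the alternating md5sums shown earlier, and a run with at least one update is a run that restarts apache2.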

claco commented 10 years ago

Clones are mistakes. If your cookbook creates a clone, edit the resource instead. If upstream does, patch upstream. This is the game. I'm not saying it's great. So, the other resource should also be edited, rather than cloned-via-same-name.

mancdaz commented 10 years ago

@claco Yes, so we're both essentially saying that both instances of the resource need to be edited/rewound. Whether they are cloned with the same name or instantiated with different names, they are essentially the same resource that lays down the same file, and both need to be edited.

As it happens, the template in the upstream cookbook has been updated such that we don't need to rewind at all any more. As suggested by @odyssey4me in the referenced issue, we can just set the correct attrs before creating the resource.
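
As a hedged illustration of the attribute-driven approach, the sketch below renders a ports.conf-style file from a hash using Ruby's standard ERB library. The attribute names (`listen_addresses`, `listen_ports`), the template text, and the `render_ports` helper are assumptions for illustration, not copied from the upstream cookbook. The point is only that when every resource renders from the same attributes, the output is the same on every pass.

```ruby
require "erb"

# Hypothetical attribute names and values; the real cookbook's may differ.
attrs = { "listen_addresses" => ["0.0.0.0"], "listen_ports" => [80, 443, 8080] }

# Hypothetical template modelled on the ports.conf shown in the diff above.
TEMPLATE = <<~'TPL'
  #This file generated via template by Chef.
  <% ports.each do |port| -%>
  <% addresses.each do |addr| -%>
  Listen <%= addr %>:<%= port %>
  NameVirtualHost <%= addr %>:<%= port %>
  <% end -%>
  <% end -%>
TPL

# Render the template from the attribute hash.
# trim_mode "-" drops the newline after tags that end in "-%>".
def render_ports(attrs)
  ERB.new(TEMPLATE, trim_mode: "-").result_with_hash(
    addresses: attrs["listen_addresses"],
    ports: attrs["listen_ports"]
  )
end

first_pass  = render_ports(attrs)
second_pass = render_ports(attrs)

puts first_pass
puts "identical on second pass: #{first_pass == second_pass}"
```

With the attributes set before any resource is created, there is nothing left to rewind, so a clone of the resource would render the same content rather than a competing one.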