basho-labs / puppet-riak

A puppet module to deploy Riak clusters
Apache License 2.0

Failed to call refresh: Could not stop Service[riak] #21

Closed andyshinn closed 11 years ago

andyshinn commented 11 years ago

I'm trying to get a VM up in Vagrant with Puppet. Everything works after the second pass. However, the first pass gives me the following output and an error starting (stopping?) Riak:

Running Puppet with riak.pp...
Notice: /Stage[main]//Service[iptables]/ensure: ensure changed 'running' to 'stopped'
Notice: /Stage[main]/Riak::Config/File[/etc/security/limits.conf]/content: content changed '{md5}667a77a9a360468f97d30f97eb775e47' to '{md5}a5ae0675396f1ae690f5b624c41b6c83'
Notice: /Stage[main]/Riak/File[/etc/riak]/ensure: created
Notice: /Stage[main]/Riak::Vmargs/File[/etc/riak/vm.args]/ensure: created
Notice: /Stage[main]/Riak/Group[riak]/ensure: created
Notice: /Stage[main]/Riak/User[riak]/ensure: created
Notice: /Stage[main]/Riak::Appconfig/File[/var/log/riak]/ensure: created
Notice: /Stage[main]/Riak::Appconfig/File[/var/lib/riak]/ensure: created
Notice: /Stage[main]/Riak::Appconfig/File[/usr/lib/riak]/ensure: created
Notice: /Stage[main]/Riak::Appconfig/File[/etc/riak/app.config]/ensure: created
Notice: /Stage[main]/Riak::Config/Yumrepo[basho-products]/descr: descr changed '' to 'basho packages for $releasever-$basearch'
Notice: /Stage[main]/Riak::Config/Yumrepo[basho-products]/baseurl: baseurl changed '' to 'http://yum.basho.com/el/6/products/$basearch'
Notice: /Stage[main]/Riak::Config/Yumrepo[basho-products]/enabled: enabled changed '' to '1'
Notice: /Stage[main]/Riak::Config/Yumrepo[basho-products]/gpgcheck: gpgcheck changed '' to '1'
Notice: /Stage[main]/Riak::Config/Yumrepo[basho-products]/gpgkey: gpgkey changed '' to 'http://yum.basho.com/gpg/RPM-GPG-KEY-basho'
Notice: /Stage[main]/Riak/Package[riak]/ensure: created
Notice: /Stage[main]/Riak/Service[riak]/ensure: ensure changed 'stopped' to 'running'
Error: /Stage[main]/Riak/Service[riak]: Failed to call refresh: Could not stop Service[riak]: Execution of '/sbin/service riak stop' returned 1: 
Error: /Stage[main]/Riak/Service[riak]: Could not stop Service[riak]: Execution of '/sbin/service riak stop' returned 1: 
Notice: Finished catalog run in 67.07 seconds

My Puppet manifest for this VM:

service { 'iptables':
  ensure => 'stopped',
  enable => false,
}

class { 'riak':
  vmargs_cfg    => {
    '-name'     => "riak@0.0.0.0",
  }
}

Any idea why this might be happening?

haf commented 11 years ago

That looks like a problem with the package not installing Riak properly after having been ensured 'created'. If you have the output from the second run, we can compare which actions were taken there that were not taken in this run.

Also, I'd like to see what happens if you SSH into the machine at this point and run '/sbin/service riak stop'.

jsmartin commented 11 years ago

Which version of Riak is it installing? Assuming it's the latest -- Riak 1.4.


andyshinn commented 11 years ago

After the failure, if I SSH in and stop the service, I get:

[vagrant@riak ~]$ sudo /sbin/service riak stop
[vagrant@riak ~]$ echo $?
0

At this point, the only Riak related process running is:

riak      3125  0.0  0.0  10824   432 ?        S    18:07   0:00 /usr/lib64/riak/erts-5.9.1/bin/epmd -daemon

If I run vagrant provision (to apply the Puppet manifest again) I get:

Notice: /Stage[main]/Riak/File[/etc/riak]/owner: owner changed 'riak' to 'root'
Notice: /Stage[main]/Riak/File[/etc/riak]/group: group changed 'riak' to 'root'
Notice: /Stage[main]/Riak/Service[riak]/ensure: ensure changed 'stopped' to 'running'
Notice: Finished catalog run in 2.99 seconds

At this point the service appears to be running just fine:

[vagrant@riak ~]$ riak ping                 
pong

If I SSH in after the initial Puppet-reported error, I can also restart the service and it comes up just fine:

[vagrant@riak ~]$ sudo /etc/init.d/riak restart
Starting riak:                                             [  OK  ]

andyshinn commented 11 years ago

Oh, and yes, I am running version 1.4.0 (riak version doesn't output anything though):

[vagrant@riak ~]$ riak version

[vagrant@riak ~]$ rpm -qa | grep riak
riak-1.4.0-1.el6.x86_64

andyshinn commented 11 years ago

Here is what I think is happening:

The init.d script appears to support the restart command and actually checks if it is running first:

restart|force-reload)
        [ $running -eq 0 ] && stop
        start
        ;;

I added hasrestart => true to the service resource, and it appears to work correctly now. With that set, a refresh calls the init script's restart instead of a separate stop and start. Because Riak is not fully running at that point, restart skips the stop, and start just exits 0.
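
For reference, the change amounts to roughly the following. This is a minimal sketch: only hasrestart is the actual change, and the ensure and enable values are assumptions about what the module's service resource already sets, not copied from it.

```puppet
# Sketch of the riak service resource with the fix applied.
# ensure/enable are assumed here; only hasrestart is the change under discussion.
service { 'riak':
  ensure     => running,
  enable     => true,
  # On refresh, call the init script's own 'restart' action
  # instead of Puppet's default separate stop followed by start.
  hasrestart => true,
}
```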

Maybe a valid fix is adding a param for $has_restart?
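
Something along these lines, as a rough sketch. The parameter name and its default are hypothetical, and the real class takes many more parameters than shown here.

```puppet
# Hypothetical sketch: expose the service's restart behaviour as a class parameter.
# $has_restart and its default are assumptions, not the module's current API.
class riak (
  $has_restart = true,
) {
  service { 'riak':
    ensure     => running,
    enable     => true,
    # Pass the parameter through so users can fall back to stop + start if needed.
    hasrestart => $has_restart,
  }
}
```

A manifest could then opt out with class { 'riak': has_restart => false } if an init script ever lacks a working restart action.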

haf commented 11 years ago

The init script will always have hasrestart, no? Perhaps a pull request? =) Thanks for the debugging you have done and your writing to explain it!

andyshinn commented 11 years ago

Yeah, I think most init scripts nowadays have a 'restart' option. Puppet actually defaults to hasrestart => false for the service resource. I can probably fix this up and will make a pull request, no problemo!

andyshinn commented 11 years ago

Whoa... I did something seriously wrong... I'll open a new pull request.

haf commented 11 years ago

Closing