cerner / cerner_kafka

A Kafka Cookbook for Chef
Apache License 2.0

broker restart fails #68

Closed by noslowerdna 5 years ago

noslowerdna commented 5 years ago

The /var/log/kafka/kafka_init_stdout.log file showed:

No kafka server to stop

for this chef-client failure:

  * service[kafka] action stop

    ================================================================================
    Error executing action `stop` on resource 'service[kafka]'
    ================================================================================

    Mixlib::ShellOut::ShellCommandFailed
    ------------------------------------
    Expected process to exit with [0], but received '1'
    ---- Begin output of /sbin/service kafka stop ----
    STDOUT: Attempting to shutdown Kafka...
    Attempting to shutdown Kafka...
    Error stopping Kafka
    STDERR:
    ---- End output of /sbin/service kafka stop ----
    Ran /sbin/service kafka stop returned 1

    Resource Declaration:
    ---------------------
    # In /var/chef/cache/cookbooks/cerner_kafka/recipes/default.rb

    207: service "kafka" do
    208:   action [:enable, :start]
    209:   supports :status => true, :restart => true
    210: end
    211:

    Compiled Resource:
    ------------------
    # Declared in /var/chef/cache/cookbooks/cerner_kafka/recipes/default.rb:207:in `from_file'

    service("kafka") do
      action [:enable, :start]
      supports {:status=>true, :restart=>true}
      retries 0
      retry_delay 2
      default_guard_interpreter :default
      service_name "kafka"
      enabled nil
      running nil
      masked nil
      pattern "kafka"
      declared_type :service
      cookbook_name "cerner_kafka"
      recipe_name "default"
    end

This happened when attempting to update from 2.2.0 to 2.2.1.

I don't know exactly what changed to cause this, but the cookbook should be more robust: it should not fail when it cannot stop a broker because the broker was not detected as running.

noslowerdna commented 5 years ago

I think this might just be a timing issue where the init.d script's sleep time roughly matches the time it takes for the broker process to shut down.
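
If that is the cause, one possible way to avoid the race is to poll for process exit instead of sleeping for a fixed time, and to treat "nothing to stop" as success. The following is only a rough sketch of that idea, not the cookbook's actual init.d script; the function name `stop_and_wait`, its arguments, and the 30-second default timeout are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the cerner_kafka init script).
# Usage: stop_and_wait PID [TIMEOUT_SECONDS]
stop_and_wait() {
    pid="$1"
    timeout="${2:-30}"   # assumed default; tune to the broker's real shutdown time

    # Nothing to stop: report it but return success, so a repeated
    # "stop" does not make the caller (e.g. chef-client) fail.
    # Note: kill -0 also fails if we lack permission to signal the PID.
    if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        echo "No process to stop"
        return 0
    fi

    # Ask the process to shut down gracefully.
    kill -TERM "$pid" 2>/dev/null

    # Poll once per second until the process is gone or the timeout expires,
    # rather than sleeping a fixed time and checking only once.
    elapsed=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Process $pid still running after ${timeout}s"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done

    echo "Process $pid stopped"
    return 0
}
```

With this shape, a stop that finds no running process exits 0, and a slow shutdown only fails once the whole timeout has elapsed, so the result no longer depends on whether the broker happens to exit just before or just after a single sleep.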