Atalanta / cucumber-chef

Framework for test-driven infrastructure development
http://cucumber-chef.org
Apache License 2.0

Cucumber-chef setup hangs #119

Closed edyu closed 11 years ago

edyu commented 11 years ago

I'm just following the book and I'm stuck at this point.

It seems the culprit is the following code:

echo -n "Waiting on chef-validator.pem and chef-webui.pem to appear..."
until [ -f /etc/chef-server/chef-validator.pem ] && [ -f /etc/chef-server/chef-webui.pem ]; do
  echo -n "."
  sleep 1
done
echo "done."
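Because the loop above has no exit condition other than the files appearing, it spins forever when the server install fails. A minimal sketch of a bounded version follows; `wait_for_files` is a hypothetical helper name, not something cucumber-chef ships, and the timeout value is whatever the caller picks.

```shell
# Wait until every listed file exists, or give up after a timeout (seconds).
# wait_for_files is a hypothetical helper, not part of cucumber-chef.
wait_for_files() {
  local timeout="$1"; shift
  local waited=0
  local f missing
  while :; do
    missing=0
    for f in "$@"; do
      [ -f "$f" ] || missing=1
    done
    # All files present: success.
    [ "$missing" -eq 0 ] && return 0
    # Deadline reached: report and fail instead of looping forever.
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

It could be called as `wait_for_files 300 /etc/chef-server/chef-validator.pem /etc/chef-server/chef-webui.pem || exit 1`, so a failed install surfaces as an error rather than a hang.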

The log shows:

zpatten commented 11 years ago

Looks like you're using Chef 11.x; I had this happen the other day for the first time. In my case the Chef 11.x OmniTruck installation failed; it aborted during the .deb package download. If you look further back in the log you should be able to see whether the Chef-Solo OmniTruck run exploded or finished successfully. I'm guessing yours exploded like mine did, as the chef-validator private key never gets generated if OmniTruck is not successful.

In my case I just re-ran the setup task and OmniTruck completed. Unfortunately this is an issue (if your issue turns out to be the same as mine) with OpsCode's web server aborting the download for some reason.

Let me know what the log shows. If OmniTruck did complete for you please send me the log via a GIST.

zpatten commented 11 years ago

Also I would suggest looking at the WIKI documentation linked in the README.md; a lot has changed in the workflow and in how cucumber-chef operates since the book was released.

edyu commented 11 years ago

I am using 11.4. Here is the part of log that shows something wrong:

Starting Chef Client, version 11.4.0
Compiling Cookbooks...

================================================================================
Recipe Compile Error in /tmp/chef-solo/cookbooks/chef-server/recipes/default.rb
================================================================================

RuntimeError
------------
Could not locate chef-server package matching version 'latest' for node.

Cookbook Trace:
---------------
/tmp/chef-solo/cookbooks/chef-server/recipes/default.rb:29:in `from_file'

Relevant File Content:
----------------------
/tmp/chef-solo/cookbooks/chef-server/recipes/default.rb:

22: node['chef-server']['prereleases'],
23: node['chef-server']['nightlies'])
24: unless omnibus_package
25: err_msg = "Could not locate chef-server"
26: err_msg << " pre-release" if node['chef-server']['prereleases']
27: err_msg << " nightly" if node['chef-server']['nightlies']
28: err_msg << " package matching version '#{node['chef-server']['version']}' for node."
29>> raise err_msg
30: end
31: else
32: omnibus_package = node['chef-server']['package_file']
33: end
34:
35: package_name = ::File.basename(omnibus_package)
36: package_local_path = "#{Chef::Config[:file_cache_path]}/#{package_name}"
37:
38: # omnibus_package is remote (ie a URI) let's download it

Chef Client failed. 0 resources updated

zpatten commented 11 years ago

I've just run an 11.4 setup with the chef version set to latest and had no issue.

I suggest you re-run setup; if that fails, destroy the lab and run setup again.

...snip...
+ bash install.sh -v latest
Downloading Chef latest for ubuntu...
Installing Chef latest
Selecting previously unselected package chef.
(Reading database ... 52493 files and directories currently installed.)
Unpacking chef (from .../chef_latest_amd64.deb) ...
Setting up chef (11.4.0-1.ubuntu.11.04) ...
Thank you for installing Chef!
+ mkdir -pv /var/chef/cache /tmp/chef-solo/cookbooks/chef-server
mkdir: created directory `/var/chef'
mkdir: created directory `/var/chef/cache'
mkdir: created directory `/tmp/chef-solo/cookbooks/chef-server'
+ wget -qO- https://github.com/opscode-cookbooks/chef-server/archive/master.tar.gz
+ tar xvzC /tmp/chef-solo/cookbooks/chef-server --strip-components=1
chef-server-master/.gitignore
chef-server-master/Berksfile
chef-server-master/CHANGELOG.md
chef-server-master/CONTRIBUTING.md
chef-server-master/Gemfile
chef-server-master/LICENSE
chef-server-master/README.md
chef-server-master/Thorfile
chef-server-master/Vagrantfile
chef-server-master/attributes/
chef-server-master/attributes/default.rb
chef-server-master/chefignore
chef-server-master/libraries/
chef-server-master/libraries/dev_helper.rb
chef-server-master/libraries/omnitruck_client.rb
chef-server-master/metadata.rb
chef-server-master/recipes/
chef-server-master/recipes/default.rb
chef-server-master/recipes/dev.rb
chef-server-master/templates/
chef-server-master/templates/default/
chef-server-master/templates/default/chef-server.rb.erb
+ chef-solo --config /etc/chef/solo.rb --json-attributes /tmp/chef-solo/attributes.json --logfile /var/log/chef/chef-solo.log --log_level debug
Starting Chef Client, version 11.4.0
Compiling Cookbooks...
Converging 61 resources
Recipe: chef-server::default
  * remote_file[/tmp/chef-solo//chef-server_11.0.6-1.ubuntu.12.04_amd64.deb] action create
    - copy file downloaded from [] into /tmp/chef-solo//chef-server_11.0.6-1.ubuntu.12.04_amd64.deb
        (file sizes exceed 10000000 bytes, diff output suppressed)

  * package[chef-server_11.0.6-1.ubuntu.12.04_amd64.deb] action install
    - install version 11.0.6-1.ubuntu.12.04 of package chef-server_11.0.6-1.ubuntu.12.04_amd64.deb

  * directory[/etc/chef-server] action create
    - create new directory /etc/chef-server
    - change owner from '' to 'root'
    - change group from '' to 'root'

  * template[/etc/chef-server/chef-server.rb] action create
--- /tmp/chef-tempfile20130413-4398-1bscisn    2013-04-13 04:47:49.685327377 +0000
...snip...
zpatten commented 11 years ago

FYI; in the future please supply the full log via a gist; snippets don't really allow me to see the full context of what's going on with your bootstrap.

zpatten commented 11 years ago

Here is the output from the 11.4 run: https://github.com/zpatten/cc-chef-repo/blob/master/README.md using v3.0.6; configuration is listed at the bottom.

zpatten commented 11 years ago

Also; 11.4 is the latest chef-client version; the latest omnitruck server package is chef-server_11.0.6-1.ubuntu.12.04_amd64.deb (i.e. v11.0.6)

edyu commented 11 years ago

Ok. Let me try to rerun everything. :)

zpatten commented 11 years ago

https://github.com/opscode-cookbooks/chef-server#attributes :latest is the default value for the OmniTruck installer as well as Cucumber-Chef.

Setting the chef server version to 11.4.0 via the cucumber-chef config will not work as OmniTruck does not have a chef-server package for that version (at least last time I tried it).

Let me know what happens with the run.

edyu commented 11 years ago

Now it's stuck, and I think it's the extra space that's messing up the expect; there is a space after the : in "Please enter a password for the new user: "

zpatten commented 11 years ago

I've never seen an issue with this before; there must be something else going on here.

You can do a cucumber-chef ssh --bootstrap to get into the test lab, then sudo su - to become root and try running it manually.

Also, again, those few lines of log don't really help that much.

+ cp -v /etc/chef-server/chef-validator.pem /etc/chef-server/chef-webui.pem /home/vagrant/.chef
`/etc/chef-server/chef-validator.pem' -> `/home/vagrant/.chef/chef-validator.pem'
`/etc/chef-server/chef-webui.pem' -> `/home/vagrant/.chef/chef-webui.pem'
+ ln -sv /etc/chef-server/chef-validator.pem /etc/chef/validation.pem
`/etc/chef/validation.pem' -> `/etc/chef-server/chef-validator.pem'
+ ln -sv /etc/chef-server/admin.pem /etc/chef/admin.pem
`/etc/chef/admin.pem' -> `/etc/chef-server/admin.pem'
+ cat
+ tee /tmp/knife-config.exp
#!/usr/bin/expect -f
set timeout 10
spawn knife configure -i --server-url https://127.0.0.1 --admin-client-key /etc/chef-server/admin.pem -u vagrant -r '' --defaults --yes -VV
expect "Please enter a password for the new user:" { send "p@ssw0rd1\n" }
interact
+ chmod +x /tmp/knife-config.exp
+ /tmp/knife-config.exp
spawn knife configure -i --server-url https://127.0.0.1 --admin-client-key /etc/chef-server/admin.pem -u vagrant -r '' --defaults --yes -VV
WARNING: No knife configuration file found
Creating initial API user...
Please enter a password for the new user: p@ssw0rd1

DEBUG: Signing the request as admin
DEBUG: Sending HTTP Request via POST to 127.0.0.1:443/users
DEBUG: ---- HTTP Status and Header Data: ----
DEBUG: HTTP 1.1 502 Bad Gateway
DEBUG: server: nginx/1.2.3
DEBUG: date: Sat, 13 Apr 2013 04:49:41 GMT
DEBUG: content-type: text/html
DEBUG: content-length: 172
DEBUG: connection: close
DEBUG: ---- End HTTP Status/Header Data ----
ERROR: Server returned error for https://127.0.0.1/users, retrying 1/5 in 3s
DEBUG: ---- HTTP Status and Header Data: ----
DEBUG: HTTP 1.1 201 Created
DEBUG: server: nginx/1.2.3
DEBUG: date: Sat, 13 Apr 2013 04:49:45 GMT
DEBUG: content-type: application/json
DEBUG: content-length: 2268
DEBUG: connection: close
DEBUG: x-ops-api-info: flavor=osc;version=11.0.2;erchef=1.2.6
DEBUG: location: https://127.0.0.1/users/vagrant
DEBUG: ---- End HTTP Status/Header Data ----
Created user[vagrant]
Configuration file written to /home/vagrant/.chef/knife.rb
+ knife client create zpatten -a -f /home/vagrant/.chef/zpatten.pem --disable-editing --yes -VV
DEBUG: Signing the request as vagrant
DEBUG: Sending HTTP Request via PUT to 127.0.0.1:443/clients/zpatten
DEBUG: ---- HTTP Status and Header Data: ----
DEBUG: HTTP 1.1 404 Object Not Found
DEBUG: server: nginx/1.2.3
DEBUG: date: Sat, 13 Apr 2013 04:49:46 GMT
DEBUG: content-length: 40
DEBUG: connection: close
DEBUG: x-ops-api-info: flavor=osc;version=11.0.2;erchef=1.2.6
DEBUG: ---- End HTTP Status/Header Data ----
DEBUG: Signing the request as vagrant
DEBUG: Sending HTTP Request via POST to 127.0.0.1:443/clients
DEBUG: ---- HTTP Status and Header Data: ----
DEBUG: HTTP 1.1 201 Created
DEBUG: server: nginx/1.2.3
DEBUG: date: Sat, 13 Apr 2013 04:49:46 GMT
DEBUG: content-type: application/json
DEBUG: content-length: 2273
DEBUG: connection: close
DEBUG: x-ops-api-info: flavor=osc;version=11.0.2;erchef=1.2.6
DEBUG: location: https://127.0.0.1/clients/zpatten
DEBUG: ---- End HTTP Status/Header Data ----
Created client[zpatten]
zpatten commented 11 years ago

You are using a stock Ubuntu Precise image for the test lab, yes?

edyu commented 11 years ago

I'm recompiling vim so I can actually copy the whole file to gist. I'm using the AMI published by Ubuntu. It's the 64bit micro one.

edyu commented 11 years ago

https://gist.github.com/edyu/5377434

zpatten commented 11 years ago

Why not just truncate the logfile and do the run? Then open a new term, cat the log, copy-n-paste and you're done. You shouldn't need to recompile vim; that seems excessive.
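The truncate, run, cat sequence suggested above can be sketched as a small wrapper. `capture_run_log` is a hypothetical helper name, and the log path is left to the caller since it depends on where your installation writes its log.

```shell
# Truncate a log file, run one command, then print only what that run logged.
# capture_run_log is a hypothetical helper; the caller supplies the log path.
capture_run_log() {
  local logfile="$1"; shift
  local status=0
  : > "$logfile"            # truncate so earlier runs don't pollute the output
  "$@" || status=$?         # run the command, remembering its exit status
  cat "$logfile"            # print the fresh log, ready to paste into a gist
  return "$status"
}
```

For example, `capture_run_log /path/to/cucumber-chef.log cucumber-chef setup > setup-run.log` would leave a file holding only the log of that one setup attempt (the path here is a placeholder, not a known default).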

zpatten commented 11 years ago

That gist still doesn't tell me much.

Gist me up the entire log from the failed setup run and your entire config.rb (redact anything sensitive like AWS creds if you hardcoded them).

edyu commented 11 years ago

Just updated the gist with the whole file. Cat did the trick. :)

zpatten commented 11 years ago

Your instance is running out of memory, chef-solo explodes, chef-server never starts, knife stalls.

================================================================================
Error executing action `restart` on resource 'service[erchef]'
================================================================================

Errno::ENOMEM
-------------
Cannot allocate memory - fork(2)

Resource Declaration:
---------------------
# In /opt/chef-server/embedded/cookbooks/runit/definitions/runit_service.rb

169:     service params[:name] do
170:       control_cmd = node[:runit][:sv_bin]
171:       if params[:owner]
172:         control_cmd = "#{node[:runit][:chpst_bin]} -u #{params[:owner]} #{control_cmd}"
173:       end
174:       provider Chef::Provider::Service::Simple

Compiled Resource:
------------------
# Declared in /opt/chef-server/embedded/cookbooks/runit/definitions/runit_service.rb:169:in `block in from_file'

service("erchef") do
  params {:directory=>"/opt/chef-server/sv", :only_if=>false, :finish_script=>false, :control=>[], :run_restart=>true, :active_directory=>"/opt/chef-server/service", :init_script_template=>nil, :owner=>"root", :group=>"root", :template_name=>"erchef", :start_command=>"start", :stop_command=>"stop", :restart_command=>"restart", :status_command=>"status", :options=>{:log_directory=>"/var/log/chef-server/erchef", :svlogd_size=>1000000, :svlogd_num=>10, :directory=>nil, :only_if=>false, :finish_script=>false, :control=>[], :run_restart=>true, :active_directory=>nil, :init_script_template=>nil, :owner=>"root", :group=>"root", :template_name=>nil, :start_command=>"start", :stop_command=>"stop", :restart_command=>"restart", :status_command=>"status", :options=>{}, :env=>{}, :action=>:enable, :down=>false}, :env=>{}, :action=>:enable, :down=>false, :name=>"erchef"}
  provider Chef::Provider::Service::Simple
  action [:nothing]
  supports {:restart=>true, :status=>true}
  retries 0
  retry_delay 2
  service_name "erchef"
  pattern "erchef"
  start_command "/opt/chef-server/embedded/bin/chpst -u root /opt/chef-server/embedded/bin/sv start /opt/chef-server/service/erchef"
  stop_command "/opt/chef-server/embedded/bin/chpst -u root /opt/chef-server/embedded/bin/sv stop /opt/chef-server/service/erchef"
  status_command "/opt/chef-server/embedded/bin/chpst -u root /opt/chef-server/embedded/bin/sv status /opt/chef-server/service/erchef"
  restart_command "/opt/chef-server/embedded/bin/chpst -u root /opt/chef-server/embedded/bin/sv restart /opt/chef-server/service/erchef"
  startup_type :automatic
  cookbook_name :"chef-server"
  recipe_name "erchef"
end

[2013-04-13T06:17:09+00:00] ERROR: Running exception handlers
[2013-04-13T06:17:09+00:00] ERROR: Exception handlers complete
[2013-04-13T06:17:09+00:00] FATAL: Stacktrace dumped to /opt/chef-server/embedded/cookbooks/cache/chef-stacktrace.out
[2013-04-13T06:17:09+00:00] FATAL: Errno::ENOMEM: service[erchef] (chef-server::erchef line 169) had an error: Errno::ENOMEM: Cannot allocate memory - fork(2)
STDERR: Generating RSA private key, 2048 bit long modulus
..+++
......................................+++
e is 65537 (0x10001)
---- End output of chef-server-ctl reconfigure ----
Ran chef-server-ctl reconfigure returned 1

Resource Declaration:
---------------------
# In /tmp/chef-solo/cookbooks/chef-server/recipes/default.rb

 80: execute "reconfigure-chef-server" do
 81:   command "chef-server-ctl reconfigure"
 82:   action :nothing
 83: end
 84: 

Compiled Resource:
------------------
# Declared in /tmp/chef-solo/cookbooks/chef-server/recipes/default.rb:80:in `from_file'

execute("reconfigure-chef-server") do
  action [:nothing]
  retries 0
  retry_delay 2
  command "chef-server-ctl reconfigure"
  backup 5
  returns 0
  cookbook_name :"chef-server"
  recipe_name "default"
end

Chef Client failed. 4 resources updated
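
In a long debug log the out-of-memory failure is easy to miss. A rough sketch of a check for the two signatures that appear in the excerpt above (`Errno::ENOMEM` and `Cannot allocate memory`) follows; `log_shows_oom` is a hypothetical helper name and the pattern only covers the strings seen in this log.

```shell
# Print (with line numbers) any lines carrying the out-of-memory signatures
# seen in this thread's log; exit status is 0 on a match and 1 on no match.
# log_shows_oom is a hypothetical helper name.
log_shows_oom() {
  grep -n -E 'Errno::ENOMEM|Cannot allocate memory' "$1"
}
```

Running `log_shows_oom /var/log/chef/chef-solo.log` against the chef-solo log named earlier in the thread would list the offending lines, if there are any.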
edyu commented 11 years ago

https://gist.github.com/edyu/5377453

zpatten commented 11 years ago

Ya it seems your instance is low on memory; that is likely what is wreaking the havoc.

zpatten commented 11 years ago

You'll need to go with at least a small; I doubt the micro meets the minimum chef-server requirements.

zpatten commented 11 years ago

The micro only has about 600MB of memory and no swap; so once the memory tops out, the kernel starts killing processes and preventing forks, etc.
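A quick way to confirm this on the instance is to read the totals from `/proc/meminfo` before running setup. The sketch below assumes a Linux host; `mem_and_swap_mb` and `has_enough_memory` are hypothetical helper names, and the minimum is whatever figure you choose, since the thread does not state an exact requirement.

```shell
# Print total RAM and total swap in MB, read from a meminfo-format file
# (values in /proc/meminfo are reported in kB).
# mem_and_swap_mb and has_enough_memory are hypothetical helpers.
mem_and_swap_mb() {
  local meminfo="${1:-/proc/meminfo}"
  awk '/^MemTotal:/  {mem = $2}
       /^SwapTotal:/ {swap = $2}
       END {printf "%d %d\n", mem / 1024, swap / 1024}' "$meminfo"
}

# Succeed only if RAM plus swap reaches the given minimum (in MB).
has_enough_memory() {
  local min_mb="$1" meminfo="${2:-/proc/meminfo}" totals
  totals=$(mem_and_swap_mb "$meminfo")
  # totals looks like "<ram> <swap>"; split on the single space.
  [ $(( ${totals% *} + ${totals#* } )) -ge "$min_mb" ]
}
```

Something like `has_enough_memory 1024 || echo "probably too small for chef-server" >&2` could then guard the setup run; the 1024 is only an illustrative threshold.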

zpatten commented 11 years ago

http://serverfault.com/questions/375466/which-ec2-instance-best-for-chef-server

zpatten commented 11 years ago

If money is an issue I would suggest using the vagrant provider.

zpatten commented 11 years ago

FWIW; I believe the reason your knife is hanging is that expect is getting killed off or prevented from starting due to memory usage.

edyu commented 11 years ago

Thank you so much. No, it's not a money issue. I was just inexperienced in figuring out which one to use, and I didn't realize that I'm actually installing a whole chef-server on the node, as I thought we were using the hosted solution.

edyu commented 11 years ago

The cloud image from Ubuntu lists "ec2-run-instances ami-fc002cb9 -t t1.micro --region us-west-1 --key ${EC2_KEYPAIR_US_WEST_1}" as its command, so I thought I should just use t1.micro with the image. Obviously I should've thought it through.

zpatten commented 11 years ago

Hey, no worries bro; I was starting to scratch my head there, but my gut told me something else had to be going on to cause that; :+1: for gut feelings lol.

It might be possible on a micro with 10.x; I can imagine the memory footprint growing from v10 to v11, but I've never compared them, to be honest.

When I was using the AWS provider primarily at my last job, running smalls and mediums was the way I went and the smalls seemed to work well, but mediums got slightly better performance.

Also be sure to check out my cucumber-chef example chef-repo: https://github.com/zpatten/cc-chef-repo

Hopefully it will be a good place to draw some help from if you get stuck in general.

Let me know if you hit any more issues.

edyu commented 11 years ago

Thank you again.