chef-boneyard / opscode-pushy-server

Chef Push Jobs Server
https://docs.chef.io/push_jobs.html
Apache License 2.0
16 stars 10 forks source link

Test suite improvements #136

Closed stevendanna closed 7 years ago

stevendanna commented 7 years ago
stevendanna commented 7 years ago

Here is an example of the full debug log output:

    when the node nacks the quorum
[2016-07-27 11:49:28 +0100][INFO ]: Starting client DONKEY
[2016-07-27 11:49:28 +0100][INFO ]: Waiting 5 seconds for ["DONKEY"] to come online
[2016-07-27 11:49:28 +0100][DEBUG]: All nodes now online!
[2016-07-27 11:49:28 +0100][INFO ]: Waiting 30 seconds for ["DONKEY"] to enter availability == available
[2016-07-27 11:49:28 +0100][DEBUG]: Got job_stat_change: {:state=>:idle, :job_id=>nil, :command=>nil}
[2016-07-27 11:49:28 +0100][DEBUG]: Node DONKEY has availability = available
[2016-07-27 11:49:28 +0100][INFO ]: Waiting 30 seconds for ["DONKEY"] to enter availability == available
[2016-07-27 11:49:28 +0100][DEBUG]: Node DONKEY has availability = available
[2016-07-27T11:49:28+01:00] ERROR: [DONKEY] Received commit e5af89913daa803961f933b9e13df501, but command 'bad command' is not in the whitelist!
[2016-07-27 11:49:28 +0100][INFO ]: Waiting 30 seconds for https://api.chef-server.dev/organizations/pedant_testorg_api_15583/pushy/jobs/e5af89913daa803961f933b9e13df501 to finish
[2016-07-27 11:49:29 +0100][DEBUG]: Validating events. Expected 3: [#<struct Event name=:start, id="job-1", json={"timestamp"=>"2016-07-27 10:49:28.898527Z", "command"=>"bad command", "run_timeout"=>4, "quorum"=>1, "node_count"=>1, "user"=>"pedant_admin_user_api_15583", "capture"=>false}>, #<struct Event name=:quorum_vote, id="job-2", json={"timestamp"=>"2016-07-27 10:49:28.910443Z", "node"=>"DONKEY", "status"=>"failure"}>, #<struct Event name=:job_complete, id="job-3", json={"timestamp"=>"2016-07-27 10:49:28.910542Z", "status"=>"quorum_failed"}>]
[2016-07-27 11:49:29 +0100][DEBUG]: Got job_stat_change: {:state=>:idle, :job_id=>nil, :command=>nil}
[2016-07-27 11:49:30 +0100][INFO ]: Waiting 10 seconds for ["DONKEY"] to enter status == offline
[2016-07-27 11:49:30 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:30 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:30 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:30 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:31 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:31 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:31 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:31 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:32 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:32 +0100][DEBUG]: Node DONKEY has status = online
[2016-07-27 11:49:37 +0100][DEBUG]: Node DONKEY has status = offline
      the events should be: start,quorum_vote(failure),job_complete(quorum_failed)

I think we'll turn the loglevel to WARN before shipping, but my hope is that having timestamps and more structured messages will help us debug the unstable tests.