juju-solutions / layer-cwr

Layer for building the Juju Jenkins CI env

App status didn't update immediately on register controller #57

Open johnsca opened 7 years ago

johnsca commented 7 years ago

For some reason, when I registered the controller, it still said blocked even though the action was successful. The next update-status hook that triggered fixed the status. From what I can see in the code, it ought to work.

06 Feb 2017 17:03:08-05:00  juju-unit   executing   running action register-controller                   
06 Feb 2017 17:03:19-05:00  workload    blocked     Waiting for controller registration.                 
06 Feb 2017 17:03:19-05:00  juju-unit   idle                                                             
06 Feb 2017 17:08:07-05:00  juju-unit   executing   running update-status hook                           
06 Feb 2017 17:08:08-05:00  workload    active      Ready (controllers: ['lxd']; store: unauthenticated).

kwmonroe commented 7 years ago

Yeah, I've seen this too. I registered a controller at 19:35:00, the job was created, and the action succeeded. However, according to the log, the job didn't actually run until update-status ran about 4.75 minutes later:

unit-cwr-0: 19:35:01 INFO unit.cwr/0.register-controller Jenkins job 1 not running yet
unit-cwr-0: 19:39:48 INFO unit.cwr/0.juju-log Reactive main running for hook update-status
unit-cwr-0: 19:39:49 INFO unit.cwr/0.juju-log Import of jenkins layer functionality failed.This is expected for Jenkins clients.
unit-cwr-0: 19:39:49 INFO unit.cwr/0.juju-log Invoking reactive handler: reactive/cwr.py:172:controllers_updated
unit-cwr-0: 19:39:49 INFO unit.cwr/0.juju-log Controllers file has changed
unit-cwr-0: 19:39:50 INFO unit.cwr/0.juju-log Invoking reactive handler: reactive/cwr.py:182:jenkins_available

ktsakalozos commented 7 years ago

When we call the register-controller action, it waits 60 seconds for a controller-registration Jenkins job to finish. That Jenkins job runs juju register and writes the name of the controller to the controller.names file (https://github.com/juju-solutions/layer-cwr/blob/master/jobs/RegisterController/config.xml#L35). The register-controller action then reads this file (inside report_status, https://github.com/juju-solutions/layer-cwr/blob/master/lib/utils.py#L125) and changes the status message.
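
To make the wait-then-read flow concrete, here is a minimal sketch of a bounded polling loop. It is not the code in the charm: wait_for and its parameters are hypothetical names, and the predicate stands in for whatever check decides that the Jenkins job has finished. The point is only that the caller can tell "finished within the timeout" apart from "timed out" before it reads the file and reports status.

```python
import time


def wait_for(predicate, timeout=60.0, interval=1.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or timeout seconds pass.

    Hypothetical helper, not part of layer-cwr. Returns True if the
    predicate succeeded in time and False if the deadline was reached.
    clock and sleep are injectable so the loop can be tested without
    actually waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

If the loop returns False, the action could report that the job has not finished yet rather than reading a file that has not been written.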

Could it be possible that the controller.names file is not synced at the time report_status reads it?

We recently moved from calling the update-status hook to calling report_status directly from the actions. So I suggest that if we see this happening again, we do a sync on controller.names after writing to it. At the moment I cannot reproduce the issue and I am not sure whether delayed writes are the problem; I am just speculating.
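
A minimal sketch of what "sync after writing" could look like if the file were written from Python. The function names and the one-name-per-line format are assumptions for illustration; the actual file is written by the Jenkins job, and its real format may differ.

```python
import os


def write_controller_names(path, names):
    """Write one controller name per line, then force the data to disk.

    Hypothetical helper: the name and the file format are assumptions.
    """
    with open(path, "w") as f:
        f.write("\n".join(names) + "\n")
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit the file to storage


def read_controller_names(path):
    """Return the controller names in the file, or [] if it is missing."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```

Note that a reader on the same host normally sees data as soon as the writer has flushed or closed the file, so an explicit fsync would only help if the write had not reached the OS yet when report_status ran.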

johnsca commented 7 years ago

Honestly, why are we even deferring to a Jenkins job to register a controller? That should be handled entirely in charm code. If we want to have a Jenkins job that also registers a controller, it should simply call the charm code that the action uses; but I'm not actually convinced that the "jobs as an API" really buys us anything.

ktsakalozos commented 7 years ago

The "jobs as an API" pattern served its purpose, and I also believe we should move away from it. It helped in the early stages of the charm to move forward quickly: I only had to edit the Jenkins jobs in place to get the functionality I needed. Now, having all the functionality in (Python?) scripts living outside the Jenkins jobs will give us the following two advantages:

The build-bundle action served as a test field to verify the above points. Your thoughts?