juju-solutions / matrix

Automatic testing of big software deployments under various failure conditions
Other
8 stars 9 forks source link

Added leadership to implicit selectors. #11

Closed pengale closed 7 years ago

pengale commented 7 years ago

To make this work, I had to refactor the way that we do the fetching and composing of units for the implicit selectors. It's not as elegant or flexible any more, but it does support asynchrounous operations, which was critical.

I also had to make selectors asynchronous, and get rid of type checking for their returns -- verifying that they all return a coroutine object isn't as interesting as being able to verify what that coroutine actually returns ...

@johnsca @bcsaller

pengale commented 7 years ago

Per discussion w/ bcsaller, we don't actually want asynchronous selectors -- I'm going to work on a commit that will track leadership in python-libjuju's model instead.

pengale commented 7 years ago

@bcsaller I ran into an interesting issue w/ integrating leadership into python-libjuju.

Specifically, we were forgetting that leadership is not actually a juju feature! It's a layer feature, and all the info lives in the relation data. That means that we don't get any deltas related to leadership over the juju websocket api.

I see two possibilities:

1) I am missing a clever way of getting at the relation data over the websocket api. 2) The implementation in the PR is actually the correct implementation. (My first instinct is to try to add caching ... but glitch does stuff that specifically would invalidate cached leader data by triggering the election of a new leader ... it's probably not a great idea to cache it.)

bcsaller commented 7 years ago

This isn't true. https://github.com/juju/juju/blob/staging/cmd/juju/status/output_tabular.go#L180 shows leader status being added to the output of units. I haven't verified this lives in the AllWatcher delta stream or not but leadership is first class and if for some reason it isn't there we can argue that it should be. Ask someone in core for a clear answer here. I can't dig into this from the sprint right now.

pengale commented 7 years ago

bcsaller: You're correct -- there are facades related to leadership in the api. I'm a bit sidetracked debugging deploy stuff in python-libjuju (the hadoop-processing bundle exposes some bugs), but I think the code is something like this:

        u = client.UniterFacade()
        u.connect(self.model.connection)
        await u.WatchLeadershipSettings(entity)

Where entity is either an application or a unit (need to test both, and see which works -- it's a little bit unclear from just reading the code.)

pengale commented 7 years ago

@bcsaller @johnsca Per my convo w/ Ben and then on the #juju-dev channel last Friday, I refactored to using fullstatus, as leadership is not exposed directly over the API.

At some point, we might want to cache the return, so that we're not calling it so much. It isn't causing a noticeable delay in the runtime of matrix, however, so I'd like to leave caching as a separate exercise (we're messing with things, so we have to be careful about the cache not falling out of date).