juju-solutions / matrix

Automatic testing of big software deployments under various failure conditions

No units in model - timeout #92

Closed ktsakalozos closed 7 years ago

ktsakalozos commented 7 years ago

I got the following error: http://pastebin.ubuntu.com/24130907/

How can I assist in figuring this out?

pengale commented 7 years ago

Looks like @seman might have a repro:

http://juju-ci.vapour.ws/view/CWR/job/cwr-aws/2104/console

seman commented 7 years ago

This seems to occur more often: https://pastebin.canonical.com/184080/ https://pastebin.canonical.com/184082/

pengale commented 7 years ago

@seman @ktsakalozos I suspect what is happening here is that python-libjuju is losing the websocket connection, but not telling us about it. Fixing it is going to be interesting. Adding a Pinger might help a bit, but I think that we also need to figure out a sensible "reconnect and resume" routine, and figure out where to stick it.
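To make the "Pinger plus reconnect and resume" idea concrete, here is a rough sketch using only asyncio. It is not python-libjuju code: `connect`, `on_resume`, and the `ping()` method on the connection object are hypothetical stand-ins for whatever the library ends up exposing, and the intervals and retry counts are arbitrary.

```python
import asyncio


async def ping_and_reconnect(connect, on_resume, stop, interval=10.0,
                             ping_timeout=5.0, max_retries=5):
    """Ping a connection periodically; reconnect and resume if a ping fails.

    connect   -- hypothetical async factory returning an object with an
                 async ping() method (not a python-libjuju API)
    on_resume -- hypothetical async callback, e.g. to re-register watchers
    stop      -- asyncio.Event; set it to end the loop
    Returns the number of successful reconnects.
    """
    conn = await connect()
    reconnects = 0
    while not stop.is_set():
        try:
            # A silently dropped websocket shows up as a ping that
            # either raises or never returns within ping_timeout.
            await asyncio.wait_for(conn.ping(), ping_timeout)
        except (asyncio.TimeoutError, ConnectionError):
            conn = await _reconnect(connect, max_retries)
            await on_resume(conn)
            reconnects += 1
        # Sleep until the next ping, waking early if stop is set.
        try:
            await asyncio.wait_for(stop.wait(), interval)
        except asyncio.TimeoutError:
            pass
    return reconnects


async def _reconnect(connect, max_retries, base_delay=0.5):
    """Retry connect() with exponential backoff between attempts."""
    for attempt in range(max_retries):
        try:
            return await connect()
        except ConnectionError:
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ConnectionError("gave up after %d reconnect attempts" % max_retries)
```

The open question of "where to stick it" remains: this sketch assumes the caller owns the loop and passes in a resume callback, which keeps the pinger independent of whatever state needs to be rebuilt after a reconnect.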

pengale commented 7 years ago

Moved to Beta milestone, as I think that it is important; most of the work is probably going to happen on the python-libjuju side.

pengale commented 7 years ago

Opened https://github.com/juju/python-libjuju/issues/99 in python-libjuju. Setting to "blocked" as this is probably best addressed in python-libjuju.

pengale commented 7 years ago

See updates to https://github.com/juju/python-libjuju/issues/99 for some thinking on this ...

pengale commented 7 years ago

This should be fixed, as of the merge that I just made.