juju-solutions / bundle-cwr-ci

Deploy the Juju Charm CI system

Matrix borks #19

Closed merlijn-sebrechts closed 7 years ago

merlijn-sebrechts commented 7 years ago

CWR tests run fine. Matrix borks. I have no idea how to proceed.

Job:

juju run-action cwr/0 cwr-charm-commit \
    repo=http://github.com/tengu-team/layer-limeds \
    charm-name=limeds \
    push-to-channel=stable \
    namespace=tengu-team \
    reference-bundle=cs:~tengu-team/limeds-core

Errors:

2017-03-09 14:13:14 DEBUG matrix:464:add_model: Creating model -matrix-large-kit
2017-03-09 14:13:14 DEBUG matrix:380:run: Error adding model: failed to create config: creating config from values failed: "-matrix-large-kit" is not a valid name: model names may only contain lowercase letters, digits and hyphens
2017-03-09 14:13:14 DEBUG ------------------------------------------------------------------------------
2017-03-09 14:13:14 DEBUG matrix:431:cleanup: Error while running crashdump.
2017-03-09 14:13:14 DEBUG Traceback (most recent call last):
2017-03-09 14:13:14 DEBUG   File "/usr/local/lib/python3.5/dist-packages/matrix/rules.py", line 421, in cleanup
2017-03-09 14:13:14 DEBUG     model_name = context.juju_model.info.name
2017-03-09 14:13:14 DEBUG AttributeError: 'Attribute' object has no attribute 'info'
2017-03-09 14:13:14 DEBUG matrix:439:cleanup: Error destroying model: 'Attribute' object has no attribute 'info'
2017-03-09 14:13:14 DEBUG Traceback (most recent call last):
2017-03-09 14:13:14 DEBUG   File "/usr/local/lib/python3.5/dist-packages/matrix/rules.py", line 436, in cleanup
2017-03-09 14:13:14 DEBUG     await self.destroy_model(context)
2017-03-09 14:13:14 DEBUG   File "/usr/local/lib/python3.5/dist-packages/matrix/rules.py", line 500, in destroy_model
2017-03-09 14:13:46 DEBUG RuntimeError: Event loop is closed
2017-03-09 14:13:46 DEBUG Exception ignored in: <generator object Queue.get at 0x7fc269e7ae08>
2017-03-09 14:13:46 DEBUG Traceback (most recent call last):
2017-03-09 14:13:46 DEBUG   File "/usr/lib/python3.5/asyncio/queues.py", line 170, in get
2017-03-09 14:13:46 DEBUG     getter.cancel()  # Just in case getter is not done yet.
2017-03-09 14:13:46 DEBUG   File "/usr/lib/python3.5/asyncio/futures.py", line 227, in cancel
2017-03-09 14:13:46 DEBUG     self._schedule_callbacks()
2017-03-09 14:13:46 DEBUG   File "/usr/lib/python3.5/asyncio/futures.py", line 242, in _schedule_callbacks
2017-03-09 14:13:46 DEBUG     self._loop.call_soon(callback, self)
2017-03-09 14:13:46 DEBUG   File "/usr/lib/python3.5/asyncio/base_events.py", line 497, in call_soon
2017-03-09 14:13:46 DEBUG     handle = self._call_soon(callback, args)
2017-03-09 14:13:46 DEBUG   File "/usr/lib/python3.5/asyncio/base_events.py", line 506, in _call_soon
2017-03-09 14:13:46 DEBUG     self._check_closed()
2017-03-09 14:13:46 DEBUG   File "/usr/lib/python3.5/asyncio/base_events.py", line 334, in _check_closed
2017-03-09 14:13:46 DEBUG     raise RuntimeError('Event loop is closed')
2017-03-09 14:13:46 DEBUG RuntimeError: Event loop is closed
2017-03-09 14:13:46 DEBUG Exit Code: 200

Full log: http://pastebin.ubuntu.com/24146234/
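The two cleanup tracebacks in the log look like follow-on failures. The model could not be added, so the cleanup step apparently had no model info to read. A minimal Python sketch of a more defensive lookup is below. The function name `model_name_or_none` and the stand-in objects are hypothetical and are not taken from matrix's rules.py.

```python
from types import SimpleNamespace


def model_name_or_none(context):
    """Return context.juju_model.info.name, or None if any link is missing."""
    model = getattr(context, "juju_model", None)
    info = getattr(model, "info", None)
    return getattr(info, "name", None)


# Hypothetical stand-ins: one context whose model was never created,
# and one whose model exists and carries a name.
no_model = SimpleNamespace(juju_model=object())
with_model = SimpleNamespace(
    juju_model=SimpleNamespace(info=SimpleNamespace(name="job42-matrix-large-kit"))
)

print(model_name_or_none(no_model))
print(model_name_or_none(with_model))
```

With a lookup like this, a cleanup routine could skip crashdump and model destruction when no name is available instead of raising a second error.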

pengale commented 7 years ago

@kwmonroe that looks like a snafu with the prefix that's getting passed into matrix. It looks like it's missing the test id ...

kwmonroe commented 7 years ago

Aaaaagggghhh @petevg!

@galgalesh, my humblest apologies! This week, we were focused on an issue where a user could cancel a Jenkins job and the matrix models would be left abandoned (potentially costing people real cloud money). To combat this, we introduced a predictable prefix for all models related to a Jenkins job, and we clean those up regardless of how the job completes.

I broke the piece that passes the model prefix down to matrix, which resulted in a model name that starts with a hyphen ("-matrix-large-kit" in your log), and Juju does not allow that. Hence your error.

The fix came in today with:

https://github.com/juju-solutions/layer-cwr/commit/724a0711ac827b81e1c14b36a0d3501e5fb6dc2d#diff-af2ce77cfb836652097fe60bbb4ea4fcR342

This has been released to the stable channel as cwr-70. Sorry again for the busted UX!!!
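The failure mode described above can be sketched in a few lines of Python. This is an illustration only, not the code from the linked commit. The function name `build_model_name` is hypothetical, and the regular expression is an assumption based on the error message and on the statement that a leading hyphen is rejected.

```python
import re

# Assumed approximation of Juju's model-name rule: lowercase letters,
# digits and hyphens, and not starting with a hyphen.
MODEL_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]*")


def build_model_name(prefix, suffix):
    """Join prefix and suffix with a hyphen, failing early on a bad name."""
    name = "{}-{}".format(prefix, suffix)
    if not MODEL_NAME_RE.fullmatch(name):
        raise ValueError("{!r} is not a valid model name".format(name))
    return name


# A populated prefix yields a valid name.
print(build_model_name("job42", "matrix-large-kit"))

# An empty prefix reproduces the "-matrix-large-kit" name from the log.
try:
    build_model_name("", "matrix-large-kit")
except ValueError as err:
    print(err)
```

Validating the name before asking the controller to create the model would surface a missing prefix as a single clear error.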

merlijn-sebrechts commented 7 years ago

No prob. How do I get this fix? Just upgrade the charm, or...?

kwmonroe commented 7 years ago

Hey @galgalesh - you can get the fix with:

juju upgrade-charm cwr

Your timeline suggests to me that you are running cwr-65, which is pretty close to cwr-70 (meaning we didn't break the action API between those versions). Any jobs created by actions on cwr-65 should run successfully on cwr-70. Of course, please let us know if that is not the case.

Worst case scenario: your jobs fail with cwr-70, at which point you would need to delete the job from the Jenkins UI and re-run the action (cwr-charm-[commit|release|pr|etc]).

merlijn-sebrechts commented 7 years ago

Thanks! Any idea when this stuff will stabilize? Plans to promulgate this bundle?

kwmonroe commented 7 years ago

What's wrong @galgalesh? You don't like tearing down / spinning up a CI/CD system every couple days?!?

I kid, of course. Our dev cycle is aligned with Ubuntu releases, which means new cycles start every April / October. My best guess at the moment is that we'll be stable before the new cycle starts sometime in April. I know that's not very concrete, but we still have lots to do to address the current open issues:

https://github.com/juju-solutions/layer-cwr/issues

And yes, the current plan is to promulgate the cwr charm and cwr-ci bundle -- if people would stop finding bugs and requesting features, that should happen at the end of this cycle as well ;)