openstack-charmers / zaza

A Python3-only functional test framework for Charms
Apache License 2.0

Handling of Juju temporary error conditions #328

Open fnordahl opened 4 years ago

fnordahl commented 4 years ago

Hello,

I have only hit this once, so I do not expect it to happen often, but should we catch errors like these and retry silently?

The controller in question happily lives on and is serving other models with Zaza right now.

Traceback (most recent call last):
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/bin/functest-run-suite", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 173, in main
    bundle=args.bundle)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 118, in func_test_runner
    run_env_deployment(env_deployment, keep_model=preserve_model)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 68, in run_env_deployment
    config_steps.get(deployment.model_alias, []))
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/charm_lifecycle/configure.py", line 48, in configure
    run_configure_list(functions)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/charm_lifecycle/configure.py", line 37, in run_configure_list
    utils.get_class(func)()
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/openstack/charm_tests/nova/setup.py", line 53, in manage_ssh_key
    keystone_session = openstack_utils.get_overcloud_keystone_session()
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/openstack/utilities/openstack.py", line 384, in get_overcloud_keystone_session
    get_overcloud_auth(model_name=model_name),
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/openstack/utilities/openstack.py", line 1634, in get_overcloud_auth
    model_name=model_name)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/openstack/utilities/openstack.py", line 1505, in get_application_config_option
    model_name=model_name)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/__init__.py", line 48, in _wrapper
    return run(_run_it())
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/__init__.py", line 36, in run
    return task.result()
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/__init__.py", line 47, in _run_it
    return await f(*args, **kwargs)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/zaza/model.py", line 513, in async_get_application_config
    return await model.applications[application_name].get_config()
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/juju/application.py", line 244, in get_config
    return (await app_facade.Get(application=self.name)).config
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/juju/client/facade.py", line 471, in wrapper
    reply = await f(*args, **kwargs)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/juju/client/_client8.py", line 957, in Get
    reply = await self.rpc(msg)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/juju/client/facade.py", line 607, in rpc
    result = await self.connection.rpc(msg, encoder=TypeEncoder)
  File "/home/ubuntu/src/charm-octavia/build/builds/octavia/.tox/func-smoke/lib/python3.6/site-packages/juju/client/connection.py", line 456, in rpc
    raise errors.JujuAPIError(result)
juju.errors.JujuAPIError: getting state: getting storage provider registry: authentication failed.: authentication failed
caused by: requesting token: failed executing the request http://10.245.161.156:5000/v3/auth/tokens
caused by: Post http://10.245.161.156:5000/v3/auth/tokens: EOF
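
To make the "catch and retry" idea concrete, here is a minimal sketch of a retry decorator. It is an illustration under stated assumptions, not Zaza's implementation: `TransientAPIError` is a hypothetical stand-in for `juju.errors.JujuAPIError` so the snippet runs without Juju installed, and the substrings in `TRANSIENT_MARKERS` are simply lifted from the traceback above. Which messages really indicate a temporary condition would need to be decided separately.

```python
import functools
import time


class TransientAPIError(Exception):
    """Hypothetical stand-in for juju.errors.JujuAPIError."""


# Substrings copied from the error above. Treating them as markers of a
# temporary condition is an assumption made for this sketch.
TRANSIENT_MARKERS = ("authentication failed", "EOF")


def is_transient(exc):
    """Return True when the error text contains one of the markers."""
    message = str(exc)
    return any(marker in message for marker in TRANSIENT_MARKERS)


def retry_transient(attempts=3, delay=1.0, exc_type=TransientAPIError):
    """Retry the wrapped call on transient errors, doubling the delay."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type as exc:
                    # Re-raise on the last attempt or when the error does
                    # not look temporary.
                    if attempt == attempts or not is_transient(exc):
                        raise
                    time.sleep(wait)
                    wait *= 2
        return wrapper
    return decorator
```

Applied to a call such as the config lookup in the traceback, the decorator would re-run it a bounded number of times and still surface the original exception if the condition persists, so a genuine failure is not hidden.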

fnordahl commented 3 years ago

Another example:

2020-09-23 08:17:15 [INFO] Waiting for all units to be idle
2020-09-23 08:18:55 [WARNING] RPC: Connection closed, reconnecting
2020-09-23 08:18:55 [WARNING] Receiver: Connection closed, reconnecting
2020-09-23 08:19:55 [WARNING] Receiver: Connection closed, reconnecting
Traceback (most recent call last):
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/bin/functest-run-suite", line 8, in <module>
    sys.exit(main())
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 204, in main
    force=args.force)
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 141, in func_test_runner
    force=force)
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/charm_lifecycle/func_test_runner.py", line 83, in run_env_deployment
    config_steps.get(deployment.model_alias, []))
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/charm_lifecycle/configure.py", line 48, in configure
    run_configure_list(functions)
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/charm_lifecycle/configure.py", line 37, in run_configure_list
    utils.get_class(func)()
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/openstack/charm_tests/vault/setup.py", line 140, in auto_initialize
    states=test_config.get('target_deploy_status', {}))
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/__init__.py", line 48, in _wrapper
    return run(_run_it())
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/__init__.py", line 36, in run
    return task.result()
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/__init__.py", line 47, in _run_it
    return await f(*args, **kwargs)
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/zaza/model.py", line 986, in async_wait_for_application_states
    timeout=timeout)
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/juju/model.py", line 720, in block_until
    raise websockets.ConnectionClosed(1006, 'no reason')
websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), reason = no reason
Exception ignored in: <bound method BaseEventLoop.__del__ of <_UnixSelectorEventLoop running=False closed=True debug=False>>
Traceback (most recent call last):
  File "/usr/lib/python3.5/asyncio/base_events.py", line 431, in __del__
  File "/usr/lib/python3.5/asyncio/unix_events.py", line 58, in close
  File "/usr/lib/python3.5/asyncio/unix_events.py", line 139, in remove_signal_handler
  File "/usr/lib/python3.5/signal.py", line 47, in signal
TypeError: signal handler must be signal.SIG_IGN, signal.SIG_DFL, or a callable object
2020-09-23 08:19:55 [ERROR] Task was destroyed but it is pending!
task: <Task pending coro=<Connection._connect.<locals>._try_endpoint() running at /tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/juju/client/connection.py:601> wait_for=<Future pending cb=[BaseSelectorEventLoop._sock_connect_done(7)(), Task._wakeup()]> cb=[as_completed.<locals>._on_completion() at /usr/lib/python3.5/asyncio/tasks.py:478]>
2020-09-23 08:19:55 [ERROR] Task was destroyed but it is pending!
task: <Task pending coro=<Connection.reconnect() running at /tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/juju/client/connection.py:591> wait_for=<Future pending cb=[Task._wakeup()]>>
Exception ignored in: <coroutine object Connection.reconnect at 0x7f7c2dc49308>
Traceback (most recent call last):
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/juju/client/connection.py", line 591, in reconnect
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/juju/client/connection.py", line 652, in _connect_with_login
  File "/tmp/tmp.EF9HCSyRN5/func-smoke/lib/python3.5/site-packages/juju/client/connection.py", line 612, in _connect
  File "/usr/lib/python3.5/asyncio/tasks.py", line 488, in _wait_for_one
  File "/usr/lib/python3.5/asyncio/queues.py", line 170, in get
  File "/usr/lib/python3.5/asyncio/futures.py", line 227, in cancel
  File "/usr/lib/python3.5/asyncio/futures.py", line 242, in _schedule_callbacks
  File "/usr/lib/python3.5/asyncio/base_events.py", line 497, in call_soon
  File "/usr/lib/python3.5/asyncio/base_events.py", line 506, in _call_soon
  File "/usr/lib/python3.5/asyncio/base_events.py", line 334, in _check_closed
RuntimeError: Event loop is closed
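
For the asynchronous case in this second example, a retry would have to wrap the awaited call itself. The sketch below shows one possible shape. `ConnectionClosed` is a hypothetical stand-in for `websockets.exceptions.ConnectionClosed`, and `retry_async` and `coro_factory` are names invented for this illustration. In practice the model connection would probably also need to be re-established before the next attempt, which this sketch does not do.

```python
import asyncio


class ConnectionClosed(Exception):
    """Hypothetical stand-in for websockets.exceptions.ConnectionClosed."""


async def retry_async(coro_factory, attempts=3, delay=1.0,
                      retry_on=(ConnectionClosed,)):
    """Await coro_factory() again after a retryable exception.

    coro_factory must return a fresh coroutine on every call, because a
    coroutine object can only be awaited once.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await coro_factory()
        except retry_on:
            # Out of attempts: let the caller see the original error.
            if attempt == attempts:
                raise
            # Linear backoff between attempts.
            await asyncio.sleep(delay * attempt)
```

A caller would pass a zero-argument callable, for example a `lambda` or `functools.partial` around the wait coroutine, rather than an already-created coroutine object.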