juju / python-libjuju

Python library for the Juju API
Apache License 2.0
59 stars 100 forks source link

awaited juju Actions that are complete or failed, don't reflect that status #719

Closed addyess closed 2 years ago

addyess commented 2 years ago

action.wait() returns itself with updated action.results, but the action.data is stale.

this can result in situations where the results have valid data, but action.success may say "pending" or "running" when in reality it's "completed" or "failed".

addyess commented 2 years ago

https://github.com/juju/python-libjuju/pull/717 was an attempt to fix this but failed with further attention

cderici commented 2 years ago

@addyess could you please confirm that this is still happening, so I'll put it in my queue? Thanks!

addyess commented 2 years ago

@cderici yes unless there's a new juju release after 3.0.1, this issue still plagues us. I'd think something as simple as a test like this would fail

async def test_awaited_action_completed():
   ubuntu = juju.deploy("ubuntu")
   ...
   action = await ubuntu.units[0].run("sleep 30")
   before = datetime.now()
   result = await action.wait()
   assert result.success == "completed", "Shouldn't be 'pending' -- only completed or failed"
   after = datetime.now()
   # test (after - before) >= 30 seconds
jneo8 commented 2 years ago

Same issue on https://launchpad.net/charm-infra-node

addyess commented 2 years ago

I decided to test again with juju==3.0.2 and the results have shifted a bit

        before = datetime.now()
        action = await unit.run("sleep 30; echo 'Done'")
        action = await action.wait()
        delta = datetime.now() - before
        assert action.status == "completed", f"Should be running after 30, but was {action.status} after {delta}"

The test fails like this, so it indicates the sleep ran to completion

Should be running after 30, but was running after 0:00:33.946744

You can see it completed the echo

>>> action.results
{'return-code': 0, 'stdout': 'Done\n'}
>>> action.safe_data
{'model-uuid': '6d0284b0-6043-42d5-8...c7cc9584ee', 'id': '95', 'receiver': 'kubernetes-control-plane/0', 'name': 'juju-run', 'status': 'running', 'message': '', 'enqueued': '2022-10-21T17:30:25Z', 'started': '2022-10-21T17:30:25Z', 'completed': '0001-01-01T00:00:00Z'}
>>> action.data
{'model-uuid': '6d0284b0-6043-42d5-8...c7cc9584ee', 'id': '95', 'receiver': 'kubernetes-control-plane/0', 'name': 'juju-run', 'status': 'running', 'message': '', 'enqueued': '2022-10-21T17:30:25Z', 'started': '2022-10-21T17:30:25Z', 'completed': '0001-01-01T00:00:00Z'}
addyess commented 2 years ago

its interesting that in model.py

    async def get_action_output(self, action_uuid, wait=None):
        """ Get the results of an action by ID.

        :param str action_uuid: Id of the action
        :param int wait: Time in seconds to wait for action to complete.
        :return dict: Output from action
        :raises: :class:`JujuError` if invalid action_uuid
        """
        action = await self._get_completed_action(action_uuid, wait=wait)
        # ActionResult.output is None if the action produced no output
        return {} if action.output is None else action.output

the variable action ACTUALLY HAS the completed=True, but only the actrion.output is returned from this method