canonical / testflinger

https://testflinger.readthedocs.io/en/latest/
GNU General Public License v3.0

removing or moving agent logfiles during test phase confuses the agent #372

Open plars opened 1 month ago

plars commented 1 month ago

While the agent runs through the test phases, it keeps the original job spec as a JSON file, along with log files for each phase, in the root of the agent execution directory. Then at the end of the job, it reads in the pieces it needs and sends the final results to the server before marking the job complete. A recent job we saw caused a problem with this by running the following at the end of the job to collect all of its own logs into artifacts:

mkdir artifacts
mv * artifacts

Predictably, this crashed the agent because it could no longer find its own files where it expected them to be:

[24-10-07 09:08:53]   ERROR: (agent.py:296)| [Errno 2] No such file or directory: '.../run/d21e3feb-2465-484c-b69e-cb503ef94ce2/testflinger-outcome.json'
Traceback (most recent call last):
  File "/.../agent.py", line 270, in process_jobs
    exit_code, exit_event, exit_reason = job.run_test_phase(
  File "/.../job.py", line 128, in run_test_phase
    self._update_phase_results(
  File "/.../job.py", line 151, in _update_phase_results
    with open(results_file, "r+") as results:
FileNotFoundError: [Errno 2] No such file or directory: '.../run/d21e3feb-2465-484c-b69e-cb503ef94ce2/testflinger-outcome.json'
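
For anyone who wants to reproduce the failure mode outside of a real agent, here is a minimal sketch in Python. It is not the agent's code: the directory layout, the `test.log` file name, and both function names are assumptions made for illustration. Only the `testflinger-outcome.json` name and the `"r+"` open mode come from the traceback above. The job's `mv * artifacts` is emulated with `shutil.move` so the sketch does not depend on a shell.

```python
import json
import shutil
import tempfile
from pathlib import Path

OUTCOME_FILE = "testflinger-outcome.json"  # file name taken from the traceback


def simulate_job_moving_everything(run_dir: Path) -> None:
    """Emulate `mkdir artifacts; mv * artifacts` executed inside run_dir."""
    artifacts = run_dir / "artifacts"
    artifacts.mkdir()
    for entry in list(run_dir.iterdir()):
        # mv cannot move a directory into itself, so skip artifacts
        if entry != artifacts:
            shutil.move(str(entry), str(artifacts / entry.name))


def reproduce() -> bool:
    """Return True if reopening the outcome file fails after the move."""
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        # Hypothetical stand-ins for the files the agent keeps in the run dir
        (run_dir / OUTCOME_FILE).write_text(json.dumps({}))
        (run_dir / "test.log").write_text("hypothetical phase log\n")

        simulate_job_moving_everything(run_dir)

        try:
            # Same open mode as in the traceback
            with open(run_dir / OUTCOME_FILE, "r+"):
                return False
        except FileNotFoundError:
            return True
```

Calling `reproduce()` should return `True`, which mirrors the `FileNotFoundError` shown in the traceback.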

We could set cwd to a different path, but we would need to make sure to pull artifacts from the right location if we do that. We might also want to consider whether it makes more sense to switch to transmitting results as the job runs. That's something we'd also like to look into, but I suspect it has a few more corner cases lurking, such as what to do when the transmission fails or is interrupted.
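
A rough sketch of the cwd idea, to make the trade-off concrete. This is not how the agent is currently structured: `run_test_command` and the `workdir` subdirectory name are hypothetical. The sketch assumes the test command is a shell string, and that artifacts would then be collected from `workdir/artifacts` rather than from the run directory itself.

```python
import subprocess
from pathlib import Path


def run_test_command(run_dir: Path, command: str) -> int:
    """Run a job's shell command in a subdirectory of run_dir.

    The agent's own files (job spec, phase logs, outcome file) stay in
    run_dir, while the job's relative paths and globs resolve inside
    run_dir/workdir (a hypothetical name).
    """
    workdir = run_dir / "workdir"
    workdir.mkdir(exist_ok=True)
    result = subprocess.run(command, shell=True, cwd=workdir)
    return result.returncode
```

With this layout, a job that runs `mkdir artifacts` followed by `mv * artifacts` only touches files under `workdir`. It guards against accidents with relative paths and globs, not against a job that deliberately reaches into `..`, so the agent would probably still want to handle a missing outcome file gracefully.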

syncronize-issues-to-jira[bot] commented 1 month ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/CERTTF-429.

This message was autogenerated