Closed ep1cman closed 3 months ago
@douglas-raillard-arm Have you seen this behavior on your setup?
I haven't seen it at that specific location but in general yes, that makes sense. That's partly why relying on __del__ is a bad idea and context managers should be used.
What we should do is :
EDIT: I lied apparently and did hit the exact same problem: https://gitlab.arm.com/tooling/lisa/-/blob/main/lisa/target.py?ref_type=heads#L1309 https://gitlab.arm.com/tooling/lisa/-/blob/main/lisa/_cli_tools/lisa_load_kmod.py?ref_type=heads#L45
Actually there might be a 3rd way out to ensure __del__ is run before the namespace starts being torn down. I'll give that a shot today and report back.
@ep1cman Could you give that PR a try? It introduces a context manager API for Target so that you can use:
with Target(...) as target:
    ...
And the errors should be gone.
If it works for you I'll remove the draft status.
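For readers unfamiliar with the pattern, here is a minimal sketch of what a context manager API on a Target-like class could look like. `StubTarget` and its attributes are hypothetical stand-ins for illustration; this is not the code from the PR.

```python
class StubTarget:
    """Hypothetical stand-in for a Target-like object; not devlib's code."""

    def __init__(self):
        # Pretend a connection was opened here.
        self.connected = True

    def disconnect(self):
        # Release the connection; safe to call more than once.
        self.connected = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with-block ends, even if the block raised.
        self.disconnect()
        return False  # let any exception propagate


with StubTarget() as target:
    was_connected_inside = target.connected
```

The point of the design is that cleanup happens at a well-defined moment, while the interpreter is still fully alive, instead of whenever the garbage collector gets to the object.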
EDIT: The PR is updated to also avoid the problem when the context manager API is not used, by registering an atexit handler.
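A rough sketch of how an atexit fallback might be wired up, assuming a hypothetical `StubConnection` class (the names are illustrative and the PR may do this differently):

```python
import atexit


class StubConnection:
    """Hypothetical connection; illustrates an exit-time fallback only."""

    def __init__(self):
        self.close_calls = 0
        self._closed = False
        # Fallback: run close() at interpreter exit if nobody did earlier.
        # Note: registering a bound method keeps the object alive until then.
        atexit.register(self.close)

    def close(self):
        # Idempotent, so the explicit path and the exit hook cannot double-close.
        if self._closed:
            return
        self._closed = True
        self.close_calls += 1

    def disconnect(self):
        # Explicit path: close now and drop the no-longer-needed exit hook.
        self.close()
        atexit.unregister(self.close)


conn = StubConnection()
conn.disconnect()
conn.disconnect()  # second call is a no-op
```

atexit handlers run before module globals are torn down, which is why this path avoids the missing-name errors that a late __del__ can hit.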
Unfortunately I am currently unable to re-create the race condition with the garbage collector that caused this error.
@ep1cman Were you using a version of devlib published before or after March 29th, 1:04 AM GMT?
Currently pip freeze shows: devlib @ git+https://github.com/ARM-software/devlib.git@7276097d4e12ff2b3cfa1bb0ba40cee24ae3372b (the latest commit). I believe this is what I was using when reporting this, but I couldn't say for sure whether I had tried to update in the process of debugging this.
Ok, I'd say we can merge https://github.com/ARM-software/devlib/pull/685 and it's likely to fix the issue, but maybe what you experienced came from something not covered, as this PR should have fixed the issue for SshConnection specifically.
https://github.com/ARM-software/devlib/pull/685 should fix it for every connection type, and for any other resource held by Target, so it's more robust.
Does your script use multithreading explicitly? (i.e. more than just what devlib does internally)
The code I am developing mixes devlib with something called "labgrid" that fires off its own asyncio loop and multiple subprocesses too, so it wouldn't surprise me if that's what made the race condition pop up.
Ok, maybe it's partially related then, if something prevented the atexit handler from executing. According to the Python doc:
Note: The functions registered via this module are not called when the program is killed by a signal not handled by Python, when a Python fatal internal error is detected, or when os._exit() is called.
I can't really imagine that situation being your case though. In any case, the context manager API should provide a way to deal with that cleanly; in the current devlib codebase, that means calling Target.disconnect().
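The documentation note quoted above can be checked with a small experiment. The sketch below runs a throwaway child program (the `CHILD` source and the `child_stdout` helper are made up for this illustration) once with a normal exit and once with os._exit(), and compares what the atexit handler printed.

```python
import subprocess
import sys

# Child program: registers an exit handler, then exits in the requested way.
CHILD = """
import atexit, os, sys
atexit.register(lambda: print("cleanup ran"))
if sys.argv[1] == "hard":
    os._exit(0)   # skips atexit handlers
sys.exit(0)       # normal exit: handlers run
"""


def child_stdout(mode):
    """Run CHILD in a fresh interpreter and return what it printed."""
    done = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


normal_output = child_stdout("normal")
hard_output = child_stdout("hard")
```

With a normal exit the handler's message shows up in `normal_output`; with os._exit() nothing is printed, so any cleanup that relies only on atexit would be skipped in that case.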
I am hitting a rather odd issue where at the end of my scripts I am seeing a flood of the following messages:
Both logger and _handle_paramiko_exceptions no longer seem to exist. What I think is happening is that during interpreter shutdown these have already been garbage collected, and when it comes time to garbage collect ConnectionBase, __del__ is called but ends up being unable to run.