Open aryoda opened 1 year ago
Thanks for analyzing this. I saw that problem, too but couldn't figure out what it is. ;) But you have to help me a bit. I still don't understand why this problem appear in our testing code.
I vote for with
solution. We don't support 3.2. Based on the current Travis configuration we support Python 3.6 or higher. This is enough even for very old Debian. (I am a Debian loving Debian users.) ;)
I still don't understand why this problem appear in our testing code.
A unit tests uses the "object under test" ("OuT", eg. a class or even a single function) only for a short time.
I the unit tests calls an OuT function with Popen()
inside the unit test may finish before the spawned sub process does end because it may take a few "millisecond" until it ends/shuts down. This is non-deterministic (eg. the CPU scheduler decides who gets when and how much CPU time to execute code).
Note that Popen
works async (it returns before the sub process ends) so that the Python code can continue "in parallel".
Edit: So it may never be a problem in BiT itself but only during unit testing. But since unit tests shall succeed and no (unexpected) warning shall occur it is a problem eg. for TravisCI because a failing test may be a "false positive".
Ah thanks. I got it now. ;)
Then maybe such processed should be killed in a unittest.TestCase.TearDown()
(or TearDownClass()
) method. I need to take a closer look. But what is in my mind is that each test_*()
add the process to a member (list) of the test class. And at the end (in the TearDown() method) it iterates on that list and check if that processes should be killed or not. This would prevent us from using a timeout.
IMHO the running subprocess may occur also in BiT (but very rarely and only in case of errors).
I will fix this in the BiT code directly, this is the cleanest solution I think (and not much work).
Given the choice between the two solutions above, I would also prefer with
, because it's much cleaner and (as far as I understand it) exactly fit for this purpose.
Before changing the Popen
calls I have checked the code coverage report to see which Popen
calls are NOT covered by any unit test.
Result:
About 50 % of the Popen
calls are covered by at least one unit test, esp. backup- and ssh-related calls.
The not covered calls are mostly not central to backup functionality.
Overall risk estimation of this change: low
Reasons:
Details: The following methods are currently not covered by a unit test:
qt/serviceHelper.py#111: def _which(self, exe, fallback):
common/encfstools.py#80: def _mount(self):
common/encfstools.py#154: def checkVersion(self):
common/encfstools.py#362: class Encode: def startProcess(self):
common/encfstools.py#585: class Decode: def startProcess(self):
common/snapshots.py#1562: def statFreeSpaceSsh(self):
common/sshMaxArg.py#36: def maxArgLength(config, mid = 1048320):
common/sshtools.py#1058: def sshCopyId(pubkey, user, host, port='22', askPass='backintime-askpass', cipher=None):
common/tools.py#791: dev checkHomeEncrypt()
common/tools.py#1845: dev shutdown(self)
common/tools.py#1862: dev unity7(self)
qt/app#1018.py: def openManPage(self, man_page):
qt/plugins/notifyplugin.py#61: def message(self, profile_id, profile_name, level, message, timeout):
qt/plugins/qt4plugin.py#55 class QtPlugin: def processBegin(self):
qt/qtsystrayicon.py#203: def onStartBIT(self):
qt/snapshotsdialog.py#348: def btnDiffClicked(self):
Symptoms
When I execute unit tests with
common/make test-v
I get sometimes warnings like this (non-deterministically):/usr/lib64/python3.6/subprocess.py:786: ResourceWarning: subprocess 17750 is still running
Expected behavior
Analysis
Looking into the code locations I can see that subprocesses are mostly spawned with
subprocess.Popen()
+communicate()
code pattern to perform async subprocess calls:communicate()
willIn case of errors to
communicate()
with the subprocess it is not killed.Possible solutions
communicate
withproc.kill
after timeoutThere is a documented best-practice code snippet for this in the documentation:
with
to add context management (object lifecycle)Alternatively this code pattern could be used
https://docs.python.org/3/library/subprocess.html#subprocess.Popen
Impact
Using
with
requires context manager support fromPopen()
and this supported only for Python version 3.2 or higher.Using
with
means that BiP will no longer work for older Python versions (before 3.2).Others
Check if every code location with
Popen()
does usecommunicate()
too