Open jayvdb opened 5 years ago
Scattershot of questions:
What about applications that use something other than the .basicConfig() logging setup? Does it still misbehave? Will this case need to be handled separately?
I anticipate using non-.basicConfig() logging for #43 -- does this change how this needs to be handled?
Hopefully a grep -rE 'sys[.]std(in|out|err)' *.py in the CPython repo would suffice to identify all such stdlib references to the streams?
$ grep -rE 'sys[.]std(in|out|err)' lib/* | grep -v '/test/' | wc -l
400
To what extent do we have to worry about interference from the C portions of CPython?
Shutdown snippet for py34-37:
import logging
import logging.config

# Drop all existing handlers; fall back to manual cleanup where the private helper is missing.
try:
    logging.config._clearExistingHandlers()
except AttributeError:  # pragma Python 3.6,3.7: no cover
    logging._handlers.clear()
    logging.shutdown(logging._handlerList[:])
    del logging._handlerList[:]

# Reset the logger cache; where that is unavailable, rebuild the root logger and manager.
try:
    logging.Logger.manager._clear_cache()
except AttributeError:  # pragma Python 3.7: no cover
    logging.Logger.root = logging.root = logging.RootLogger(
        logging.WARNING)
    logging.Logger.manager = logging.Manager(logging.Logger.root)
Any import logging causes the problem, as the references to our fake streams are created during module loading.
If we import logging first, those references will be to the 'real' streams, or at least the streams as they existed when stdio_mgr was imported. That is the easy fix, but only for the one specific problem.
The "400" isn't so scary, as many of them are only using the streams. The problem is stored references to the streams, especially references created at the module import (class creation) phase. This includes any time sys.std* are used as an argument default, as those defaults are stored in the class object.
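A minimal sketch of the argument-default case, assuming a plain io.StringIO as a stand-in for one of our mocked streams; warn() and the variable names are hypothetical and only illustrate how a default outlives the swap:

```python
import io
import sys

real_stderr = sys.stderr
fake_stderr = io.StringIO()  # hypothetical stand-in for a mocked stream

sys.stderr = fake_stderr
try:
    # Simulates a module being imported while the mock is installed:
    # the default value is evaluated once, when the function is defined.
    def warn(msg, stream=sys.stderr):
        stream.write(msg)
finally:
    sys.stderr = real_stderr

# The mock is no longer in sys, but the stored default still points at it.
leaked = warn.__defaults__[0] is fake_stderr
warn("late message")  # lands in fake_stderr, not in the real stderr
```

Inspecting __defaults__ like this only finds plain function defaults; references held in class attributes or module globals would need a different check.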
I haven't scanned the C portion, so I don't know the size of the problem there.
Rather than finding them all, which would still only encompass the stdlib, I think we need to first focus on detecting the problem (and issuing warnings, or raising exceptions, etc).
One way that might work is for the streams pushed into sys to be reference counted, e.g. using weakrefs or sys.getrefcount, and have __exit__ complain bitterly if any of those streams still have living objects referring to them.
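A rough sketch of the sys.getrefcount variant, assuming CPython (the count is an implementation detail) and only stderr; CheckedStderr is a hypothetical name, not the existing StdioManager:

```python
import io
import sys
import warnings


class CheckedStderr:
    """Hypothetical context manager that warns if the fake stderr was retained."""

    def __enter__(self):
        self._prior = sys.stderr
        self._fake = io.StringIO()
        sys.stderr = self._fake
        # Baseline taken once our own references (self._fake, sys.stderr) exist.
        self._baseline = sys.getrefcount(self._fake)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Measured before restoring, so our own references cancel out.
        extra = sys.getrefcount(self._fake) - self._baseline
        sys.stderr = self._prior
        if extra > 0:
            warnings.warn(
                "%d extra reference(s) to the fake stderr remain" % extra,
                RuntimeWarning,
            )
        return False  # never swallow exceptions from the with-body
```

The manager returns itself rather than the stream from __enter__, since binding the stream with "as" would itself add a reference and trip the check.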
Technically this is a case of post-existing stream state for pytest + logging, because pytest doesn't import logging. But a different test runner would import logging, and a whole new slew of problems would be caused ...
If my mental model of Python name assignment is correct, I suspect there's not any way to invisibly "intercept" assignments to our mocked objects, such that we could track such assignments and then silently "re-target" the assignments once we're ready to __exit__, or whatever?
The GC has to know both ends of assignments in order to detect cycles...could we hijack that machinery somehow? [...looks...] Looks like gc.get_referrers might help, though the docs specifically say not to use it for anything other than debugging. :-P
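A small sketch of what a gc.get_referrers check could look like; find_holders and its ignore parameter are hypothetical, and the result can include interpreter-internal objects such as frames or namespace dicts, so it is only a debugging aid:

```python
import gc
import io


def find_holders(stream, ignore=()):
    """Return GC-tracked objects that refer directly to stream.

    ignore lists referrers we already know about (hypothetical parameter),
    compared by identity.
    """
    ignored_ids = {id(obj) for obj in ignore}
    return [
        ref for ref in gc.get_referrers(stream)
        if id(ref) not in ignored_ids
    ]


fake = io.StringIO()          # stand-in for a mocked stream
holder = {"stream": fake}     # some external code keeping a reference
found = find_holders(fake)
holder_found = any(ref is holder for ref in found)
```

This reports who holds the stream, but it cannot rebind those references for us.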
But -- what if we added a toggle flag to our object, defaulting to False, that when switched to True turns it into a silent, complete pass-through to the .prior_stdfoo? I don't think we can reach out to other code and rebind their references, but if we can just switch our objects, which those references remain bound to, to a no-op pass-through to the old objects... might work ok?
Could foul some is tests in that external code... but having is be False is probably good... we could try to set it up so that == is True but is is False? Gut-feel, that seems like a sensible way for it to work.
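One possible shape for the toggle idea, as a sketch only; SwitchableStream, its passthrough flag, and the prior attribute are hypothetical names, and only write() and flush() are forwarded here:

```python
import io


class SwitchableStream:
    """Hypothetical wrapper: captures until passthrough is set, then forwards."""

    def __init__(self, prior):
        self.prior = prior            # the stream that was in sys before us
        self._capture = io.StringIO()
        self.passthrough = False      # default: behave as the mock

    def _target(self):
        return self.prior if self.passthrough else self._capture

    def write(self, text):
        return self._target().write(text)

    def flush(self):
        return self._target().flush()

    def getvalue(self):
        return self._capture.getvalue()

    def __eq__(self, other):
        # Once switched, compare equal to the prior stream; `is` stays False.
        if self.passthrough and other is self.prior:
            return True
        return other is self

    def __hash__(self):
        # Keep the hash stable and consistent with equality to the prior stream.
        return hash(self.prior)
```

With this, a reference that some external module kept to the wrapper keeps working after the context exits, as long as the flag is flipped instead of closing the wrapper.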
As a special case of https://github.com/bskinn/stdio-mgr/issues/71, and likely to be one of several cases, stdlib logging creates a reference to sys.stderr on import. As a result, the following fairly common scenario of using a main() to test a CLI without subprocesses fails if logging is first imported within the context:
After the context handler exits, the wrapper streams are closed, and logging is still using them.
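A minimal sketch of how this symptom can arise, assuming contextlib.redirect_stderr as a stand-in for stdio_mgr and a hypothetical main() that sets up a StreamHandler; it shows a handler binding the fake stream, not the exact import-time path:

```python
import contextlib
import io
import logging


def main():
    """Hypothetical CLI entry point that configures logging when called."""
    handler = logging.StreamHandler()  # binds sys.stderr as it is right now
    logger = logging.getLogger("demo_cli")
    logger.addHandler(handler)
    logger.warning("inside main")
    return handler


fake_stderr = io.StringIO()
with contextlib.redirect_stderr(fake_stderr):
    handler = main()

captured = fake_stderr.getvalue()
fake_stderr.close()  # mimics the wrapper streams being closed on exit

# The handler outlives the context and still points at the closed stream.
still_bound = handler.stream is fake_stderr
logging.getLogger("demo_cli").removeHandler(handler)  # tidy up the demo
```

Any later log call through that handler would try to write to the closed stream, which is the failure described above.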
Easy solution: stdio_mgr imports logging, like we would need to for https://github.com/bskinn/stdio-mgr/issues/43
Even the following doesn't dislodge the problem:
To clear it properly we need to do:
or the following also seems to work
We can hide that logging cleanup inside of StdioManager. It also works when import colorama is added to that mixture, but I do not feel comfortable that it is solved.
Thankfully there don't seem to be too many references to sys.std* in the stdlib, so we might be able to check and work around them all, whenever feasible and appropriate.