Closed epogrebnyak closed 5 years ago
Handout uses a separate logger instance, so you can do this:
import handout
import logging
logging.getLogger('handout').setLevel(logging.ERROR)
I'm also open to changing which messages are printed by default.
For example, I think doc.add_text()
should always print so you can use it to replace print()
where needed. Besides this, the path to the index.html
could be printed once in the beginning to open the handout in the browser. All other messages could be hidden by default (e.g. log level info).
Do you have thoughts on this?
Taking a second look at what you say, it does seem incorrect that the same figure is added multiple times. Have you tried just running your script as python3 script.py
?
It's not entirely clear to me how Python Handout would be supposed to behave in an interactive console. It goes in a different direction than what motivated the library, i.e. to create reports in a reproducible environment. But I'd be happy to talk through how this could work and if we can extend it in a reasonable way.
python example.py
produces single output as inteded.
Changing the logging as suggested above did not help in IPython interactive session, each runfile()
adds up to more lines of output. There is a state somewhere inside the interactive session. :)
Besides this, the path to the index.html could be printed once in the beginning to open the handout in the browser.
Maybe a path should be printed just for show()
method? It would be a clutter to see paths for add_text()
. In my opinion add_text()
should behave like print()
.
Re amount of logging messages, I reopened #2 to track this for the next release.
Re executing from within Spyder, could you describe the setup a bit more, please? I assume runfile()
imports the script and executes it inline so that its variables become available in the interactive environment? Does the generated handout look as you would expect?
There is no global state in Handout, so as long as you create a new handout.Handout()
instance every time you generate a report, I'm not sure where the accumulating state comes from.
There is no global state in Handout indeed - but there is seemingly is some state in IPython logger configuration - it accumulates messages (maybe it is a desired behaviour for a REPL).
Minimal example:
import handout
doc = handout.Handout('output')
doc.add_text('Print this text')
Running once on new kernel in IPython (same output for python minimal.py
):
[1]: runfile('D:/github/handout/minimal.py', wdir='D:/github/handout')
Print this text
But on next runs I get multiple logger messages, one more message per run:
[2]: runfile('D:/github/handout/minimal.py', wdir='D:/github/handout')
Reloaded modules: handout, handout.handout, handout.blocks, handout.tools
Print this text
Print this text
[3]: runfile('D:/github/handout/minimal.py', wdir='D:/github/handout')
Reloaded modules: handout, handout.handout, handout.blocks, handout.tools
Print this text
Print this text
Print this text
P.S. This is not a terrible thing, but rather a nuisance.
Maybe I can fill a question on Stack Overflow to ask for clues.
Thanks a lot for investigating! I'm quite surprised how the resulting handout looks like. Does it show the code lines that were executed in the REPL and in the correct order?
I should have mentioned the html output is normal (the 'handout'), the duplication of messages appears only in the IPython console.
One more line in the example:
import handout
doc = handout.Handout('output')
doc.add_text('Print this text')
doc.show()
On the first run the console output is:
Print this text
Handout written to: output\index.html
On the second:
Print this text
Print this text
Handout written to: output\index.html
Handout written to: output\index.html
Both print() and logger messages get accumulated in the console.
Thanks. I was actually curious how the resulting handout looks like, because there is no source file when running handout from the interactive interpreter.
Besides this, would it be fair to say that this is a problem with IPython (and we can notify them about it) or is there something we can and should do on our end?
no source file when running handout from the interactive interpreter
There is the source when running with runfile('D:/github/handout/minimal.py', wdir='D:/github/handout')
command. At least something with oriignal file content is executed.
I filled an issue with IPython, maybe someone can shed some light on it.
The issue is definitely on IPython side, but maybe we can do some tricky configuration of logger at init to prevent message duplication.
The best of my understanding is that a logger object a session-level singleton, that exists in IPython session. Any time the script is run in the session again the getLogger('handout')
refers to existing instance.
Any time there is a new call _logger.info(message)
we somehow create not just a logging event, but also a steam 'handler', that gets accumulated inside _logger.handlers
.
Would need a bit of more research to find out what to do about it (in principle - randomise creation of a logger name or cut out adding more handlers to the logger).
This reminds me a bit of duplicate logging messages I've seen in TensorFlow. What do you think about getting rid of the logger and just using print()
statements together with a quite=False
option in the constructor?
Let me bear with this issue for a little while - it appears the problem is not even IPython, but it is specific to Spyder. I came up to running the script in IPython session in console and the messages do not duplicate:
In [1]: %run minimal.py
Print this text
In [2]: %run minimal.py
Print this text
Most Stack Overflow answers like this one deal with a situation where there is addHandler()
call in script code and it gets repeated when script is called in the same session. Also there is a solution to set logger.propagate = False
, but injecting it in Handout code had no effect.
The wierd point is to detect where the handlers get added in between the calls to the script. I would like to investigate a bit, if no result, can possibly use print. Maybe wrap print in own helper:
def log(message, logger=None):
```Logging utility to print *message* to console.```
if logger:
logger.info(message)
else:
print(message, silent=False)
Some people may see logging vs printing as a matter of code style, so the decision will be up to you as an author and maintainer of the project.
God, @danijar! ))) you are actually adding a new handler in __init__.py
!
Maybe this code should be inside handout.py
or tools.py
or new logging.py
- __init__.py
is the last place I looked for some configuration interference code, as I searched codebase of Spyder and IPython instead. ))
https://github.com/danijar/handout/blob/master/handout/__init__.py#L1-L13
import sys
import logging
from .handout import Handout
# Set up logger.
logger = logging.getLogger('handout')
logger.setLevel(logging.INFO)
logger.propagate = False # Global logger should not print messages again.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter('%(message)s'))
logger.addHandler(handler)
What happens in that IPython resets logger handlers after %run
command, while Spyder keeps the handlers alive. The reason for this is some ambiguity in logging.shutdown()
behaviour.
As a quick fix, we can do something like below for __init__py
:
import sys
import logging
from .handout import Handout
# Set up logger.
logger = logging.getLogger('handout')
logger.setLevel(logging.INFO)
logger.propagate = False # Global logger should not print messages again.
if not logger.handlers:
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter('%(message)s'))
logger.addHandler(handler)
The fix in __init__.py
really works, but I suggest making things more explicit via create_logger()
function. create_logger()
may be placed in tools.py
, for example
import sys
import logging
def create_logger():
logger = logging.getLogger('handout')
logger.setLevel(logging.INFO)
logger.propagate = False # Global logger should not print messages again.
if not logger.handlers:
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter('%(message)s'))
logger.addHandler(handler)
# The logger can be access anywhere in code by its name
# with logging.getLogger('handout') but we pass it explcitly
# as function result
return logger
later in Handout constructior:
self._logger = create_logger()
Thanks for investigating! Yes, we shouldn't keep adding handlers :D
The reason to create it in __init__.py
is to make it available once the library is imported so the user can change the logging level, format, etc. So let's include your first solution for now? If you have a better solution, just let me know.
I run example.py in IPython console in Spyder and it seems that console output gets proportional to the number of times I have run the script. On the first run in new kernel I get just single message, one per call of
handout.Handout
instance method.This is what I get on the fifth run:
Maybe not a bug, but it would have felt more confortable the
handout.Handout
instance made just a single print to the screen per call.I run the script as
runfile('D:/github/handout/example.py', wdir='D:/github/handout')