Use new logging module throughout PyLNP

This will be an ongoing upgrade for several areas of code, but the mods module is particularly relevant as knowing where something went wrong is otherwise almost impossible. In general anything which may leave an invalid state after an error should have some logging available.

[Issue created by PeridexisErrant: 2015-03-05] [Last updated on bitbucket: 2015-12-04]

[Comment created by Pidgeot: 2015-12-04] Replace remaining print statements with log calls (fixes #70)

→ <>

[Comment created by Pidgeot: 2015-05-09]

Is the Qt GUI not threaded? I assumed so due to the talk about threads and the fact that it's pretty responsive.

AFAIK, the issue with Qt had to do with the fact that a Qt thread is subtly different from a standard Python thread, and this caused certain things to break - nothing else. Dirk should know more, though.

Making extra instances of Log could be troublesome since they aren't recorded in the main lines and doesn't output to the console by default, but I'm sure it could be modified a bit to work.

If you ask your new instance to output to stderr (set output_err to True), then it'll get picked up by the errorlog module to make it visible to the user (File > Output log). This isn't the default, because it's not a given that all log output needs to go to stderr (when dealing with a custom instance, that is).

As far as usability, I think the logger should support *args and interpolate those into the message, e.g. log.d("Oh no the value should be %s not %s", x, y)

Makes perfect sense. I'll get that in there.

[Comment created by jmorgensen: 2015-05-09] Is the Qt GUI not threaded? I assumed so due to the talk about threads and the fact that it's pretty responsive.

I agree that most of the core code shouldn't be threaded. I'm not even sure downloads should be in core, but that's another matter.

Making extra instances of Log could be troublesome since they aren't recorded in the main lines and doesn't output to the console by default, but I'm sure it could be modified a bit to work.

As far as usability, I think the logger should support *args and interpolate those into the message, e.g. log.d("Oh no the value should be %s not %s", x, y)

[Comment created by Pidgeot: 2015-05-09]

I think the current log system takes care of 1, but not 2.

In the case of 2: If you have absolutely no idea where it is, that's what the -d switch is for. If you know it's a particular operation, that's what push/pop_level is for.

Everything becomes threaded when you have a responsive GUI.

But we don't really have that, so I'm not sure what you're trying to say here. Other than downloading, every operation (intentionally!) blocks to avoid an unsafe situation where two operations try to do conflicting things.

[Comment created by jmorgensen: 2015-05-09] So I see two use cases:

When developing a certain feature, in order to get at parts of it as you work it.
When changes introduce a bug, and you don't know where the problem is.

I think the current log system takes care of 1, but not 2. Unless you are familiar with where certain log messages are (and there probably aren't even enough logging messages to actually help with this to begin with), it is difficult to find the error. And as for 1, yes logging is more verbose, but I don't think that's necessarily a bad thing.

For filtering by location, you can do getLogger('root').setFilter(Filter('foo.mynewlevel')) to only log messages from that location (and descendants). That's not pretty, but it only needs to be set once.

As for having to store a logger object, it's "best practices" to have a module-level logger = logging.getLogger(__name__), which seems reasonable to me.

Everything becomes threaded when you have a responsive GUI. To me, it makes the most sense to use a standard solution that's already solved all the problems, and if necessary, provide convenience methods to it.

That being said, the current way of logging is better than none, and at this point improving what's logged is probably more important than the logging system.

[Comment created by Pidgeot: 2015-05-09] To take your points in reverse order...

Good point about combining StreamHandler and StringIO; I didn't think about that one.

There's no use of push_level and pop_level because they're intended for in-progress code. Outside of highly experimental features in need of widespread testing, they generally should not be used outside of local modifications.

When you're writing a new system - let's say something along the lines of the mod merging - you'll want a way to learn more when problems occur. That's where logging comes in. Since you don't want all of that extra logging to show up outside of debugging, you use an appropriate log level (let's say DEBUG).

You could then choose to use -d and just get all the DEBUG messages - but then you're going to get debug output from all the other parts of the program as well, and that's just going to add noise to the log, especially if your problem only happens after certain other steps have been followed. You want to be able to focus your attention on that specific piece of code; the -d switch is more useful if you don't know what exactly is breaking.

You can accomplish this temporary elevation with the standard logging module by creating a new logger and using that, but now you need to store that new logger somewhere (or re-request it everytime it's needed), and I don't think that's an appropriate requirement for typical PyLNP code.

For special cases - i.e. threaded code - there's no way to accomplish this without resorting to storing a separate logger, but you can still do that even with the current system: there is an underlying class containing all of the functionality which you can instantiate; the module-level methods just expose the methods of the default instance. The currently threaded code probably should do this, but it's not been a priority at all because the threaded code is all utility code which deals with isolated, well-defined tasks (unlike e.g. stuff that touches raws, because the underlying state machine or whatever DF uses to parse raws doesn't have a complete specification), so it hasn't been necessary.

[Comment created by jmorgensen: 2015-05-08] As for level, perhaps I don't get the issue? If you want to just change the level for a subset of logging calls, the idiomatic way with logging is to call setLevel on a sub logger (e.g. 'foo.mynewlevel' above), which only sets the logging level for that sub logger. If you want to run it with debug sometimes but not other times, your current -d switch works well. I guess I don't see great usefuless in changing the log level at runtime?

It's not implemented directly because saving all log messages in memory is generally a bad idea, unless you know you won't be running for long or that you don't generate many log messages. However it is trivial with StreamHandler: simply pass it an instance of io.StringIO to store all the messages.

[Comment created by Pidgeot: 2015-05-08]

I'll admit it's a very small difference, and I don't have a big problem with it myself - but I'm not the only one writing code here, either.
That controls prefixes, but I'm talking about levels - e.g. WARNING, DEBUG, ERROR, since only messages above a certain level are output. This is particularly useful during development, and the easier it is to remove a temporary elevation, the better. I'll grant you the threading issue (but ATM, I'd rather just add a "getLog" method to the current module).
I did look at all the handlers, but I can't find a handler that lets you do that. The only one that stores the messages in memory is MemoryHandler, but that only lets you output to another handler - it doesn't provide a persistent view of the log messages output during the lifetime of the process (you'd need a custom handler for that).

[Comment created by jmorgensen: 2015-05-08]

Perhaps it's preference, but I don't see logger.warning("Can't process, expected %s", some_var) being too verbose. If absolutely necessary, yes it could be wrapped to alias e.g. .d
The actual paradigm is that you simply call logging.getLogger('foo.mynewlevel') or, if you have a logger object for foo level, you can do logger.getChild('mynewlevel'). The reason that logging doesn't provide those methods is because they are stateful and break things in a threaded context.
Have you seen all the built-in-handlers?

Personally, I ran into this with stuff logged from downloads--since it's threaded, you can't use push_level and pop_level, and writes can be interleaved (though that's rare).

Edit: Not to mention, there's not a single actual use of push_level and pop_level (yet)

[Comment created by Pidgeot: 2015-05-08] IMO, the standard logging module has a few "problems":

Spelling out the error levels gets too verbose, and that decreases the probability that it will actually be used (every character counts here, IMO, because it's the kind of thing people tend to forget).
There is no way to temporarily change the logging level without remembering the previous level at the point you want to change it from. This requires more code to be written.
No built-in way to access the log messages from within the program.

All of those could be handled by wrapping the logging module, but I really don't see the point - a custom logging module from scratch is going to be far simpler to write.

[Comment created by jmorgensen: 2015-05-08] ~~As of right now, the only references to errorlog are~~

Oops. I meant... there are only five remaining instances of print:

df.py
lnp.py (2 instances)
settings.py
images.py

But hmmmm. Why roll our own logging module in the first place instead of using logging in the standard library?

[Comment created by Pidgeot: 2015-04-22] Provide stack traces to log where appropriate; see #70

→ <<cset 7c55b3b2be18>>

[Comment created by Pidgeot: 2015-04-22] Add logging to graphics; see #70

→ <<cset 3e40b85fe226>>

[Comment created by Pidgeot: 2015-04-22] Add logging to df, baselines, settings; see #70. Fixed baselines bug.

In short, vanilla data/speech/* files were not being removed correctly. This only affects mods. However simplifying anything when baselines caused an error.

→ <<cset 48411530ecaf>>

[Comment created by Pidgeot: 2015-04-22] Add logging to mods; see #70

That's the last file in /core - someone else can do /tkgui!

→ <<cset 657547df6076>>

[Comment created by Pidgeot: 2015-04-22] Add logging to helpers, json_config, core/launcher; see #70

→ <<cset 94973e6b49fa>>

[Comment created by Pidgeot: 2015-04-22] Add logging to dfraw, embarks, hacks; see #70.

Also I forgot to import log in df, so here it is.

→ <<cset 0ea8f278428f>>

[Comment created by Pidgeot: 2015-04-22] Add logging to colors, keybinds, legends; see #70

→ <<cset 763447d1970b>>

[Comment created by PeridexisErrant: 2015-04-13] The following includes Pidgeot's changes above, my evening of changes (see pull request #40), and a few files I looked through and decided didn't need anything more. core/utilties.py and core/manifests.py assume that pull request #36 will be merged.

TODO:
tkgui/advanced.py
tkgui/binding.py
tkgui/child_windows.py
tkgui/controls.py
tkgui/dfhack.py
tkgui/graphics.py
tkgui/layout.py
tkgui/mods.py
tkgui/options.py
tkgui/tab.py
tkgui/tkgui.py
tkgui/utilities.py

DONE:
core/*.py    (list of filenames omitted for brevity)

[Comment created by Pidgeot: 2015-04-11] In the interest of keeping a measure of progress on this, I'm going to list all the files below that have not been explicitly visited for this issue (even if they already have some logging).

PeridexisErrant (and any other contributors), please reference this issue when committing changes on this (put "(see #70)" in the commit message). I will edit this post to keep the list up-to-date.

It is entirely possible, if not likely, that some of the files listed below are not really going to benefit from further logging. If you find such a file, throw a comment here. (I suspect this applies to pretty much all of the tkgui module, but it's here for completeness.)

I will also try to add any new files here, unless they are already using logging when they are added, or it is immediately clear that logging does not serve any purpose for that file.

#!text

TODO:
tkgui/advanced.py
tkgui/binding.py
tkgui/child_windows.py
tkgui/controls.py
tkgui/dfhack.py
tkgui/graphics.py
tkgui/layout.py
tkgui/mods.py
tkgui/options.py
tkgui/tab.py
tkgui/tkgui.py
tkgui/utilities.py

DONE:
core/baselines.py
core/colors.py
core/df.py
core/dfraw.py
core/download.py
core/embarks.py
core/errorlog.py
core/graphics.py
core/hacks.py
core/helpers.py
core/json_config.py
core/keybinds.py
core/launcher.py
core/legends_processor.py
core/lnp.py
core/log.py
core/mods.py
core/paths.py
core/rawlint.py
core/settings.py
core/update.py
core/utilities.py

[Comment created by Pidgeot: 2015-04-12] Rewrite update system (fixes #75, includes logging, see #70) Support for DFFD, regex-based extraction, and general JSON extraction of info

→ <<cset 3453fa181955>>

[Comment created by Pidgeot: 2015-04-12] Add logging to core/download.py (see #70)

→ <<cset 25be391e8216>>

[Comment created by Pidgeot: 2015-03-12] Support prefixes in log class (see #70) + minor refactoring

→ <>

[Comment created by Pidgeot: 2015-03-12] I will soon be adding methods push_prefix and pop_prefix (in the next push). These add prefixes to all outgoing log messages. Prefixes are added in the order they were pushed. Example:

#!python
>>> import log
>>> log.push_prefix('foo')
>>> log.push_prefix('bar')
>>> log.w('test')
WARNING: foo: bar: test

For a better context of the intended use, see this (simplified) snippet from the linter I'm working on (#73):

#!python
def check_folder(path):
    log.push_prefix('RawLint')
    files = [os.path.join(d[0], f) for f in d[2] for d in os.walk(path)]
    for f in files:
            log.push_prefix(f)
            check_file(f)
            log.pop_prefix()
    log.pop_prefix()

This means any log messages logged from check_file will include the file name automatically (as a prefix).

[Comment created by Pidgeot: 2015-03-05] I'll write some notes on my thoughts in creating the logging module - this should help clarify how I see this being used:

There are five logging levels defined - ordered from lowest to highest level, they are VERBOSE, DEBUG, INFO, WARNING and ERROR. These levels are taken from Android, since I felt their ordering made sense.

WARNING and ERROR are hopefully obvious. I imagine these are mainly going to be useful when handling exceptions. INFO is intended for informational messages such as when a high-level operation (e.g. merging a mod) starts. DEBUG is for, well, debug messages. This works well to e.g. indicate checkpoints in the code so you can see when you hit certain parts, and can follow the control flow. VERBOSE is for when you're going absolutely crazy and logging every single detail. I expect this won't be used much, if at all, but it's there if you need to really trace through a method.

In general, the way to use the module will be something like this:

#!python
import Log
Log.d("Debug message goes here")
Log.e("Error message goes here, stack trace will also be added", stack=True)

that is, the methods exported directly on the module should normally be used instead of instantiating a new Log class - but if you want to make a temporary Log instance, you can do so. The default log outputs to stderr (Log.output_err), but can be redirected to stdout (set Log.output_out), or the logged lines can be extracted on demand for whatever reason you want (using Log.get_lines()).

As the example above illustrated, you can get a stack trace printed to the log by supplying a keyword argument stack set to True (or any other value that makes an if statement true).

By default, only WARNING and higher will be logged, but I have added a command-line parameter to switch it to debug on start. Attempts to log messages at a lower level than the current level will be ignored; they won't be output. The current level can be changed permanently using set_level, or changed temporarily using push_level and pop_level - the latter can be useful during development or testing of new features.

Over time, I expect this will replace the current errorlog module, but it'll take a while to get to that stage.

Pidgeot / python-lnp

Use new logging module throughout PyLNP #93