Unify logging and tool output in AMI

remkop commented 4 years ago

From the Slack chat:

@Remko Popma I need to get a Logging strategy for AMI. It's currently a mess. Most of the debugging is log4j with TRACE->ERROR all used. You have introduced -v/vv in picocli and I've used this fairly simplistically: if (verbosity().length >= 2) System.out.println(+"readTerms"); 12:31 or to mimic log4j if (verbosity().length >= 2) System.out.println(this.getClass()+";" + LocalDate.now()+"; "enter readTerms"); 12:33 (I find it useful to have the class for finding the debug statements). Since I want to try to clean some of this up , do you have a recommended strategy? in particular is it a good idea to combine this with log4j or do something completely separate. 12:36 I also have a problem communicating between the Tool class that "calls" the operations, and communication to legacy libraries. log4j is added to every class using a static variable: public static final Logger LOG = Logger.getLogger(DictionaryCreationTool.class); static { LOG.setLevel(Level.DEBUG); }

12:38 For the time being I'm using the verbosity() with sysout but it feels there could be more... 12:38 Many thanks. 12:43 Also can there be different levels of verbosity for different subcommands, e.g. ami -p foo makeproject pdfbox -v image -vv 12:44 This would have no debug for makeproject , level 1 for pdfbox and level 2 for image`

Peter Murray-Rust 1:01 AM As an example I have a class HTMLFactory which creates HTMLElements and I want to debug it. It's completely separate from FooTool so I'm reluctant to add implements Verbosity because that means potentially altering all libraries. I could allow verbosity() to set the Level in the library classes but again it doesn't feel quite right.

Remko Popma 2:37 PM I am working on a growing command line toolset at work and I faced the same issue with logging and user-facing messages. 2:38 One strategy (which I no longer use) is to completely separate the user-facing messages from the logging; the thinking being that they serve different purposes and have a different audience. In practice however, to me this meant coding the same message twice: once to display it to the user and once to make sure it is also captured in the log file for troubleshooting purposes. I did not like this duplication. (edited)

Remko Popma 2:45 PM So now I am trying something else. 2:47 What I am currently doing is I removed all explicit use of System.out.println and System.err.println from the code and instead use the logging library (Log4j2 in my case) everywhere.

Remko Popma 2:54 PM In the log4j configuration, I have 2 appenders: a log file and the console. Everything (TRACE level and up) is logged to the file, prefixed by the log level, timestamp and the logging class. The console appender shows the user-facing messages. By default, only WARN, ERROR and FATAL log messages are logged to the console - and these are not prefixed with the log level, timestamp and the logging class - only the message is shown. If the user specifies one -v option (--verbose), then the config is changed such that INFO level messages are also shown on the console. -vv includes DEBUG level and -vvv includes TRACE level messages. (edited)

Remko Popma 3:16 PM The picocli project has two examples (1 and 2) of how to code this; but I can help if you want to try this. (edited) 3:17 It may need some extra work if we want to cleanly separate System.out for command output from System.err for diagnostics; this becomes important when we want to pipe command output to other tools.

Remko Popma 5:33 PM Should I assume that we want to pipe output of tools to other tools? (So strict segregation between STDOUT (for data) and STDERR (for diagnostics)?)

Peter Murray-Rust 5:34 PM The user needs the following: the core data from a Tool. For example running ami search they want the results/ directories. These (probably) should not be under logging control as we all want the same output for scientific purposes. 5:34 At present I don't use stderr and stdout at all as I don't expect users to use them in this way 5:35 (This may change). The present intern users are much more competent than (say) hack days I did 3+ years ago. 5:37 But piping the output of search into bash (sed, awk ) etc only works for geeks. For some users (not the current group) commandlines are terrifying. 5:38 all essential data is communicated through the local filestore.

Remko Popma 5:58 PM Regarding controlling output with --verbose: I suggest we make this option an inherited option (with scope = INHERIT) to make it available on all commands/subcommands. We then install an execution strategy that programmatically modifies the log4j configuration based on the verbosity before running the actual command.

remkop commented 4 years ago

I made the following changes:

migrate to Log4j2 API and configuration
new configuration in src\main\resources\log4j2.xml:
- log TRACE-level messages with all details to logs/ami.log file (overwrite on each command)
- log WARN-level messages to console (without timestamp, level or logger name)
replace AbstractAMITool.addLoggingLevel with plain logging and showstopperEncountered flag
replace AbstractAMIDictTool.addLoggingLevel with plain logging and showstopperEncountered flag
replace calls to DebugPrint.*Println with plain logging with "DebugPrint" Marker. We can configure this later to be routed to the console appender or something.
replace calls to AbstractAMITool.amiDebug with plain logging with "amiDebug" Marker. We can configure this later to be routed to the console appender or something.

remkop commented 4 years ago

The log4j2.xml configuration by itself does 99% of the work: all WARN and ERROR level messages are printed to the console by default. (INFO, DEBUG and TRACE level messages only go to the log file).

The user has some control via the --verbose and --log4j options in ami and amidict. We use a custom picocli execution strategy to update the logging configuration before executing the command. If the user specifies -v, then INFO level messages are also shown on the console. If the user specified -vv, then DEBUG level messages are shown on the console, etc.

See AMI.LoggingOptions#reconfigureLogging for details.

Note that this reconfiguration is done by setting the log level of the Console Appender in the log4j configuration. We no longer use Logger.setLevel to change the log level of individual loggers.

remkop commented 4 years ago

One issue remains to complete this topic: many tools print the values of the @Option-annotated fields in the parseSpecifics method, for troubleshooting purposes.

The majority of the tools use System.out.println for this. Peter and I agreed in a Skype call earlier that it is better to use the logging library for this so that the values are also captured in the log file instead of only on the console.

My question is, what log level to use for this: WARN (visible by default) or INFO (only visible if users specify --verbose on the console)? My intuition is to be quiet by default, so I would use INFO, but perhaps others have a different view?

petermr commented 4 years ago

I think quiet by default because when we have trouble we can switch the -v on. It's also useful to keep test output brief.

And many thanks to @remkop for managing this.

remkop commented 4 years ago

Closing, I believe our logging strategy is working well.

I created https://github.com/petermr/ami3/issues/54 as a follow-up item.

petermr / ami3

Unify logging and tool output in AMI #46