CompEvol / beast2

Bayesian Evolutionary Analysis by Sampling Trees
www.beast2.org
GNU Lesser General Public License v2.1
237 stars 83 forks source link

stderr is not empty for a clean run #1032

Open J-Moravec opened 2 years ago

J-Moravec commented 2 years ago

Stderr should be empty for a run, where no error was encountered. Currently, some extra information is printed to stderr. These are not warnings, these are not errors, just messages informing the user. This pollution of stderr makes it harder to programmatically reason about stderr.

For example, running BEAST on testHKY.xml and redirecting both stdin and stderr to different files:

beast testHKY.xml > stdin.txt 2> stderr.txt

the content of stderr.txt is:

File: testHKY.xml seed: 1656890616100 threads: 1
Loading package Babel v0.3.2
Loading package phylonco v0.0.5
Loading package MM v1.1.1
Loading package SA v2.0.2
Loading package BEASTLabs v1.9.7
Loading package BEAST v2.6.6
Loading package BEAST v2.6.6
  Ignoring ambiguities in tree likelihood.
  Ignoring character uncertainty in tree likelihood.
  With 69 unique site patterns.
  Using rescaling scheme : dynamic
Please wait while BEAST takes 50000 pre-burnin samples
End likelihood: -1821.8949603639173

This is not stderr and fits squarely into the stdin produced by BEAST, given that the output is for human consumption and not something that would be further piped:


                        BEAST v2.6.6, 2002-2021
             Bayesian Evolutionary Analysis Sampling Trees
                       Designed and developed by
 Remco Bouckaert, Alexei J. Drummond, Andrew Rambaut & Marc A. Suchard
  ...                                  
  [shortened]
  ...

Random number seed: 1656890616100

human: 768 4
chimp: 768 4
bonobo: 768 4
gorilla: 768 4
orangutan: 768 4
siamang: 768 4
Alignment(alignment)
  6 taxa
  768 sites
  69 patterns

Starting frequencies: [0.33909429008672115, 0.24655436447178358, 0.1531393568179342, 0.26121198862355866]
  Using BEAGLE version: 3.2.0 (PRE-RELEASE) resource 0: CPU
    with instance flags:  PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_SSE THREADING_NONE PROCESSOR_CPU FRAMEWORK_CPU
...

I have also noticed that, when running some unit-tests, an extra output is printed to stdin, specifically the information about alignment. Ideally, unit-tests should be completely silent if they are working properly, so if there is an agreement, I will try to hunt them down and try to fix them, by adding a silent option.

rbouckaert commented 2 years ago

Just to clarify: BEAST classes should not use stdout and stderr directly, but should produce output through the Log class, which has 5 PrintStreams: error, info, warning, debug, trace. By default, debug and trace are suppressed, info is output through stdout and error and warning through stderr.

What gets shown is determined through the Log.level setting. For example, running BEAST with -loglevel info should suppress all these warning messages (as well as debug and trace messages). Perhaps running beast -loglevel info testHKY.xml > stdin.txt 2> stderr.txt is already what you are looking for what running BEAST is concerned?

Maybe a "silent" option for unit tests could be implemented by setting the log-level using Log.setLevel() for all unit tests.

J-Moravec commented 2 years ago

Thanks Remco, I am looking at the -loglevel variable and testing various outputs.

Running beast -loglevel <level> testHKY.xml > stdin.txt

Click to unroll For `error`, I get: ``` Please wait while BEAST takes 50000 pre-burnin samples End likelihood: -1820.8701976801124 ``` For `warning`, I get: ``` File: testHKY.xml seed: 1656903970758 threads: 1 Loading package Babel v0.3.2 Loading package phylonco v0.0.5 Loading package MM v1.1.1 Loading package SA v2.0.2 Loading package BEASTLabs v1.9.7 Loading package BEAST v2.6.6 Loading package BEAST v2.6.6 Ignoring ambiguities in tree likelihood. Ignoring character uncertainty in tree likelihood. With 69 unique site patterns. Using rescaling scheme : dynamic Please wait while BEAST takes 50000 pre-burnin samples End likelihood: -1817.7540390167062 ``` The exact same output is with `info`, `debug` and `trace`.

None of these log levels fully eliminate the stderr.

On top of it, sometimes what is an error ends up in stdout.

Click to unroll When I made a typo and wrote: ``` beast testHKY.xml -loglevel info > stdout.txt ``` The `stderr` was empty, but the first line of `stdout.txt` was: ``` Unrecognized argument: beast2/examples/testHKY.xml ``` This is an error and should be in `stderr`. The error is thrown correctly when it is encountered, so it must be caught and not re-thrown elsewhere. https://github.com/CompEvol/beast2/blob/521ca914a7d1a2103180729a989766b27dc42cca/src/beast/pkgmgmt/Arguments.java#L474

My conclusion is that some parts of BEAST do not correctly label/assign messages into the correct (internal) stream. Such as the loading package message:

https://github.com/CompEvol/beast2/blob/521ca914a7d1a2103180729a989766b27dc42cca/src/beast/base/parser/ClassToPackageMap.java#L79

Loading package is a normal function of BEAST it should IMHO not be put into warning stream unless something unexpected has happened, such as when a file couldn't get opened:

https://github.com/CompEvol/beast2/blob/521ca914a7d1a2103180729a989766b27dc42cca/src/beast/base/parser/ClassToPackageMap.java#L120

Again, the idea is not to suppress the output (which -loglevel does), but to make sure it goes into the correct stream. After all, the whole reason why I am raising the issue is that warnings and errors are important and should not be confused by everything is running correctly messages. What is even worse, some error messages are not even in the error stream, but stdout.

I will check the Log.setLevel() for unit-tests.

rbouckaert commented 2 years ago

You are right that the philosophy behind the use of the Log class is not consistently implemented, and indeed it would be good to review the code for this. If you would be willing to have a go at this that will be really great!

Stdout and stderr should certainly not be used, since these messages may not end up with the user (I think BEAST running in a console or BEAUti's message menu may capture only some of the messages).

PS You are aware of the v2.7 code quake (commits 521ca914a7d1a2103180729a989766b27dc42cca and later + #1030), which refactors much of the code. There is still a v2.6 branch, but updates of the master branch are preferred.