choderalab / yank

An open, extensible Python framework for GPU-accelerated alchemical free energy calculations.
http://getyank.org
MIT License
179 stars 70 forks source link

Use logging for logging #3

Closed kyleabeauchamp closed 8 years ago

kyleabeauchamp commented 10 years ago

I think we should be able to use logging to control the verbosity level without having to pass around verbose arguments everywhere.

It might be possible to make this work with MPI as well: https://github.com/jrs65/python-mpi-logger

kyleabeauchamp commented 10 years ago

I think what we want is:

import logging
logger = logging.getLogger(__name__)

verbose -> logger.debug() not verbose -> logger.info()

To set the debug statements to print, do this:

import logging
logging.basicConfig(level=0)
kyleabeauchamp commented 10 years ago

Here's some example output using the default formatter. We can change the formatting to fit our needs.

DEBUG:repex.netcdf_io:Storing thermodynamic states in NetCDF file...
DEBUG:repex.netcdf_io:Serializing state 0...
DEBUG:repex.netcdf_io:Serialized state is 772051 B | 753.956 KB | 0.736 MB
DEBUG:repex.netcdf_io:Serializing state 1...
DEBUG:repex.netcdf_io:Serialized state is 772051 B | 753.956 KB | 0.736 MB
DEBUG:repex.netcdf_io:Serializing state 2...
DEBUG:repex.netcdf_io:Serialized state is 772051 B | 753.956 KB | 0.736 MB
DEBUG:repex.netcdf_io:Storing run parameters in NetCDF file...
DEBUG:repex.netcdf_io:Storing option: collision_rate -> 91.0 (type: <type 'float'>)
DEBUG:repex.netcdf_io:Storing option: constraint_tolerance -> 1e-06 (type: <type 'float'>)
DEBUG:repex.netcdf_io:Storing option: timestep -> 2.0 (type: <type 'float'>)
DEBUG:repex.netcdf_io:Storing option: nsteps_per_iteration -> 500 (type: <type 'int'>)
DEBUG:repex.netcdf_io:Storing option: number_of_iterations -> 1 (type: <type 'int'>)
DEBUG:repex.netcdf_io:Storing option: equilibration_timestep -> 1.0 (type: <type 'float'>)
DEBUG:repex.netcdf_io:Storing option: number_of_equilibration_iterations -> 1 (type: <type 'int'>)
DEBUG:repex.netcdf_io:Storing option: title -> Replica-exchange simulation created using ReplicaExchange class of repex.py on Wed Oct 30 15:39:39 2013 (type: <type 'str'>)
DEBUG:repex.netcdf_io:Storing option: minimize -> 1 (type: <type 'bool'>)
DEBUG:repex.netcdf_io:Storing option: replica_mixing_scheme -> swap-all (type: <type 'str'>)
DEBUG:repex.netcdf_io:Storing option: online_analysis -> 0 (type: <type 'bool'>)
DEBUG:repex.netcdf_io:Storing option: show_mixing_statistics -> 1 (type: <type 'bool'>)
kyleabeauchamp commented 10 years ago

The DEBUG level does not print by default--we have to set level=0 for it to show up.

jchodera commented 10 years ago

Great idea! +1

kyleabeauchamp commented 10 years ago

To deal with the MPI rank stuff, we can do something similar to this:

logger = logging.getLogger("test")

def debug0(x):
    if comm.rank == 0:
        logger.debug(x)

logger.debug0 = debug0

and then call debug0() for printing DEBUG level information only from the head node. Obviously this will need to be refined a bit, but I think the basic idea should be fine.

pgrinaway commented 9 years ago

What is the status of this?

andrrizzi commented 9 years ago

I also find it useful to save the log message in a file to be able to go back and check when something goes wrong. Right now using "yank run -s STORE -v > yank.log" doesn't work perfectly because the logging.debug() are printed on stderr while print() uses stdout. The logging module can print on the screen and save the messages in a file at the same time and with different levels. An alternative implementation that takes care of this issue and saves also the result in a file:

import logging

class Yank:
    def __init__():
        ...
        # Configure root logger
        logging.basicConfig(filename='yank.log', level=logging.DEBUG)
        stdout_handler = logging.StreamHandler(sys.stdout) # or sys.stderr if you prefer
        stdout_handler.setLevel(logging.INFO) # the level here is set according to the verbose arg
        logging.getLogger().addHandler(stdout_handler)
        ...

after the root logger has been configured the other loggers inherit the configuration so with the example above

#in alchemy.py
logger = logging.getLogger(__name__)
logger.debug('debug test')
logger.info('info test')

save both the strings "DEBUG:yank.alchemy:debug test" and "INFO:yank.alchemy:info test" in the file yank.log, but only "INFO:yank.alchemy:info test" is printed on stdout.

Maybe once this is implemented all the print() calls could be replaced by logging.info()?

The log is always appended to the existing file so it should probably be deleted at startup or with "yank cleanup".

pgrinaway commented 9 years ago

+1

andrrizzi commented 9 years ago

I have implemented this in my branch since I'm using it for debugging.

Right now the logging configuration is command-based. Only "run" and "prepare" save the log file, unless run is called with --mpi in which case the file is not saved and verbose is set to false.

I'm going to test it a little bit more while I'm using it before submitting a pull request.

jchodera commented 9 years ago

This is great, thanks! Please do submit a PR! It's easier to comment on your modifications that way.

You can mark it as "[WIP]" to denote that you aren't quite ready to merge yet.

andrrizzi commented 8 years ago

I think there's nothing else we need to do about this. Closing.