r-three / common-pile

Repo to hold code and track issues for the collection of permissively licensed data
MIT License

Create a shared logging setup. #57

blester125 closed 7 months ago

blester125 commented 7 months ago

The main approach is to call licensed_pile.logs.configure_logging in the main function of your script. The default logger name is licensed-pile. After that, use licensed_pile.get_logger to get the logger (loggers are singletons keyed by name), or call logging.getLogger("licensed-pile") directly. You can then call the usual methods on it, such as .info or .debug.
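
To illustrate the flow, here is a minimal sketch that uses only the standard library. The function configure_logging_sketch is a hypothetical stand-in for licensed_pile.logs.configure_logging; its signature, handler, and format string are assumptions and may not match the real implementation.

```python
import logging


def configure_logging_sketch(name="licensed-pile", level=logging.INFO):
    """Hypothetical stand-in for licensed_pile.logs.configure_logging."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Attach a handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


def main():
    # Configure once, at the start of the script's main function.
    configure_logging_sketch()
    # logging.getLogger returns the same object for the same name,
    # so any module can fetch the configured logger this way.
    logger = logging.getLogger("licensed-pile")
    logger.info("Starting processing")
    logger.debug("Not emitted while the level is INFO")
    return logger


if __name__ == "__main__":
    main()
```

Because loggers are looked up by name, helper modules do not need the logger passed in; they can fetch it again with the same name.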

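For parallel processing, the configure call should be at the module level rather than inside main, and the logger name should be dolma.ParallelProcessorName (with the name of your parallel processor class in place of ParallelProcessorName). The other option is to override cls.get_logger, which gives you more control over the logger.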
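
Both options for parallel processing can be sketched with the standard library alone. Everything here is illustrative: BaseProcessorSketch is a hypothetical stand-in for a dolma parallel processor base class, and configure_logging_sketch, MyProcessor, CustomLoggerProcessor, and process_text are made-up names. The comment about why the call sits at module level is an assumption, not something stated above.

```python
import logging


def configure_logging_sketch(name, level=logging.INFO):
    """Hypothetical stand-in for licensed_pile.logs.configure_logging."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler())
    return logger


class BaseProcessorSketch:
    """Hypothetical stand-in for a dolma parallel processor base class."""

    @classmethod
    def get_logger(cls):
        # Assumed naming scheme: "dolma." followed by the class name.
        return logging.getLogger(f"dolma.{cls.__name__}")


class MyProcessor(BaseProcessorSketch):
    @classmethod
    def process_text(cls, text):
        cls.get_logger().info("Processing %d characters", len(text))
        return text.upper()


# Option 1: configure at module level, using the processor's logger name.
# Assumption: this makes the setup run whenever the module is imported,
# including in worker processes, not only in the process that runs main.
configure_logging_sketch("dolma.MyProcessor")


class CustomLoggerProcessor(BaseProcessorSketch):
    # Option 2: override get_logger to choose the logger yourself.
    @classmethod
    def get_logger(cls):
        return logging.getLogger("licensed-pile")
```

With the first option the processor keeps its default logger and only the configuration is added; with the second, the processor writes to whichever logger the override returns.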