Logging in tokenizer.py ‘absorbs’ downstream logging

cldf / segments

Unicode Standard tokenization routines and orthography profile segmentation

Apache License 2.0

Issue #47

Closed lfashby closed 4 years ago

lfashby commented 4 years ago

Importing the segments package causes the application's own logging configuration to be silently ignored, because tokenizer.py calls logging.basicConfig() at module scope.

For example:

import logging

logging.basicConfig(level="INFO")
logging.info('Logging to console') # Will log to console.

While:

import logging
import segments

logging.basicConfig(level="INFO")
logging.info('Logging to console') # Will NOT log to console.

xrotwang commented 4 years ago

If you want to use segments as a library within a package, you may want to import it in function scope:

>>> import logging
>>> logging.basicConfig(level="INFO")
>>> def f():
...     import segments
...     logging.info('Logging to console')
... 
>>> f()
INFO:root:Logging to console

kylebgorman commented 4 years ago

Just wanted to put a +1 on @lfashby's issue: importing a third-party module inside a function just to avoid an unnecessary side effect is very ugly!

I've created a PR that removes the basicConfig call. As far as I can tell this has no negative consequences for the larger library. But perhaps there's some (untested) desirable side effect to calling basicConfig at module scope in tokenizer.py. If so you can ignore the PR.
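
For context, the pattern usually recommended for libraries is to log through a named logger with a NullHandler and leave root-logger configuration to the application. The sketch below illustrates that general pattern only. It is not the content of the PR, and the logger name and the tokenize function are hypothetical.

```python
import logging

# Hypothetical module-level logger; the name is illustrative only.
log = logging.getLogger("segments_like_library")
# A NullHandler keeps the library quiet unless the application configures logging.
log.addHandler(logging.NullHandler())


def tokenize(text):
    """Toy stand-in for a library function that emits a log record."""
    log.info("tokenizing %d characters", len(text))
    return text.split()
```

With this arrangement, an application that later calls logging.basicConfig(level="INFO") sees both its own messages and the library's, because the library never touched the root logger.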