Closed galamit86 closed 3 years ago
@dkapitan
Trying to implement a decorator logger, I am having trouble understanding what should happen in the decorator, and what in the function itself. More specifically, where do I instantiate a logger
object?
Currently, with the basic implementation, I do this:
## statline_bq/utils.py
import logging
# provide logging configuration
LOGGING_CONF_FILE = Path(__file__).parent / "logging_conf.toml"
logging.config.dictConfig(toml.load(LOGGING_CONF_FILE))
# create logger object
logger = logging.getLogger(__name__)
# define some function, and use logger to create the Log Record
def some_func():
# do some things
do_something()
do_something_else()
# log some things
logger.info("This is an INFO log message")
My first instinct, was to write this decorator:
## statline_bq/log.py
from pathlib import Path
import functools
import logging
import logging.config
import toml
LOGGING_CONF_FILE = Path(__file__).parent / "logging_conf.toml"
logging.config.dictConfig(toml.load(LOGGING_CONF_FILE))
def logdec(func):
@functools.wraps(func)
def log(*args, **kwargs):
logger = logging.getLogger(func.__module__) #should provide the same as __name__ in the file itself
result = func(*args, **kwargs)
return result
return log
Naively hoping that the by replacing the last line of some_func()
from logger.info("MESSAGE")
to logging.info("MESSAGE")
, and decorating it, so it ends up looking like this:
## statline_bq/utils.py
import logging
from statline_bq.log import logdec
# No logger or configuration defined here now
@logdec
def some_func():
# do some things
do_something()
do_something_else()
# log some things
logging.info("This is an INFO log message")
, it would use the logger
defined in the decorator. But that is not true, as logging.info()
simply uses the root
logger, and the logger
defined in the decorator is not used.
The examples I can find online are always about logging things from "outside" of the function - either because it finished concluded running - like the returned value or a raised exception, or value that are constant to the function - like its name or module. But it seems to me we are looking for something else: we want to log things that are happening inside the function, during normal operation.
I am still working on it, so my problem definition might be off still. But if you have some thoughts or reference to this, let me know.
@dkapitan I thought I saw you replied to this - was that somewhere else?
@galamit86 I have sent you an email with an eBook, that's probably what you saw.
@dkapitan
See logging
branch for latest implementation:
log.py
is where the decorator is created
logging_conf.toml
is where all logging configurations are provided - we can add and define formatters, handlers and loggers there.
logger
: root
, with 2 handlers:file
- a RotatingFileHandler
set to WARNING
level (although as it currently stands, we only use INFO
or ERROR
in practice)console
- a StreamHandler
set to INFO
levelutils.py
functions are currently decoratedIf you have any input, let me know!
@galamit86 looks really good, clean code and very generic way of implementation. I have learned about decorators, too 😄
One remark / suggestion: do we want to add a timestamp to the log message or do we assume that the logging environment (Prefect) has that? Proposal: add parameter with_timestamp
which defaults to false
. If true, prepend logstring with time.now()
.
@dkapitan Thanks! I learned a quite a lot about decorators and logging - though it feels like scratching the surface :)
prefect
definitely formats its log messages with a timestamp, but so do we!
See this row, which defines the format of a our log message (it's a pretty standard implementation - I saw it in most use cases I looked at).
A couple of example lines look like this:
2021-05-07 13:35:10,157 - statline_bq.utils - INFO: Starting function main with arguments: (id='83583NED', source='cbs', third_party=False, config=<Box: {'gcp': {'prod': {'cbs_dl': {'project_id': 'dataverbinders-cbs-dl', 'bucket': 'dataverbinders-cbs-dl', 'location': 'EU'}, 'external_dl': {'project_id': 'dataverbinders-external-dl', 'bucket': 'dataverbinders-external-dl', 'location': 'EU'}, 'dwh': {'project_id': 'dataverbinders-open-dwh', 'bucket': 'dataverbinders-open-dwh', 'location': 'EU'}}, 'test': {'project_id': 'dataverbinders-test', 'bucket': 'dataverbinders-test', 'location': 'EU'}, 'dev': {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}}, 'paths': {'cbs': 'cbs', 'mlz': 'mlz', 'rivm': 'rivm', 'politie': 'politie', 'azw': 'azw', 'iv3': 'iv3', 'mkb': 'mkb', 'jm': 'jm', 'interreg': 'interreg', 'agb': 'agb', 'vektis_open_data': 'vektis_open_data', 'bag': 'bag'}}>, gcp_env='dev', force=False)
2021-05-07 13:35:10,159 - statline_bq.utils - INFO: Starting function check_gcp_env with arguments: ('dev')
2021-05-07 13:35:10,159 - statline_bq.utils - INFO: Succesfully finished function check_gcp_env
2021-05-07 13:35:10,159 - statline_bq.utils - INFO: Starting function check_v4 with arguments: (id='83583NED', third_party=False)
2021-05-07 13:35:10,251 - statline_bq.utils - INFO: Succesfully finished function check_v4
2021-05-07 13:35:10,251 - statline_bq.utils - INFO: Starting function cbsodata_to_gbq with arguments: (id='83583NED', odata_version='v3', third_party=False, source='cbs', config=<Box: {'gcp': {'prod': {'cbs_dl': {'project_id': 'dataverbinders-cbs-dl', 'bucket': 'dataverbinders-cbs-dl', 'location': 'EU'}, 'external_dl': {'project_id': 'dataverbinders-external-dl', 'bucket': 'dataverbinders-external-dl', 'location': 'EU'}, 'dwh': {'project_id': 'dataverbinders-open-dwh', 'bucket': 'dataverbinders-open-dwh', 'location': 'EU'}}, 'test': {'project_id': 'dataverbinders-test', 'bucket': 'dataverbinders-test', 'location': 'EU'}, 'dev': {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}}, 'paths': {'cbs': 'cbs', 'mlz': 'mlz', 'rivm': 'rivm', 'politie': 'politie', 'azw': 'azw', 'iv3': 'iv3', 'mkb': 'mkb', 'jm': 'jm', 'interreg': 'interreg', 'agb': 'agb', 'vektis_open_data': 'vektis_open_data', 'bag': 'bag'}}>, gcp_env='dev', force=False)
2021-05-07 13:35:10,251 - statline_bq.utils - INFO: Starting function set_gcp with arguments: (<Box: {'gcp': {'prod': {'cbs_dl': {'project_id': 'dataverbinders-cbs-dl', 'bucket': 'dataverbinders-cbs-dl', 'location': 'EU'}, 'external_dl': {'project_id': 'dataverbinders-external-dl', 'bucket': 'dataverbinders-external-dl', 'location': 'EU'}, 'dwh': {'project_id': 'dataverbinders-open-dwh', 'bucket': 'dataverbinders-open-dwh', 'location': 'EU'}}, 'test': {'project_id': 'dataverbinders-test', 'bucket': 'dataverbinders-test', 'location': 'EU'}, 'dev': {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}}, 'paths': {'cbs': 'cbs', 'mlz': 'mlz', 'rivm': 'rivm', 'politie': 'politie', 'azw': 'azw', 'iv3': 'iv3', 'mkb': 'mkb', 'jm': 'jm', 'interreg': 'interreg', 'agb': 'agb', 'vektis_open_data': 'vektis_open_data', 'bag': 'bag'}}>, 'dev', 'cbs')
2021-05-07 13:35:10,252 - statline_bq.utils - INFO: Succesfully finished function set_gcp
2021-05-07 13:35:10,252 - statline_bq.utils - INFO: Starting function get_urls with arguments: (id='83583NED', odata_version='v3', third_party=False)
2021-05-07 13:35:10,326 - statline_bq.utils - INFO: Succesfully finished function get_urls
2021-05-07 13:35:10,326 - statline_bq.utils - INFO: Starting function get_metadata_cbs with arguments: (id='83583NED', third_party=False, odata_version='v3')
2021-05-07 13:35:10,398 - statline_bq.utils - INFO: Succesfully finished function get_metadata_cbs
2021-05-07 13:35:10,398 - statline_bq.utils - INFO: Starting function get_metadata_gcp with arguments: (id='83583NED', source='cbs', odata_version='v3', gcp=<Box: {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}>, credentials=None)
2021-05-07 13:35:12,364 - statline_bq.utils - INFO: Starting function get_latest_folder with arguments: ('cbs/v3/83583NED', <Box: {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}>)
2021-05-07 13:35:14,707 - statline_bq.utils - INFO: Succesfully finished function get_latest_folder
2021-05-07 13:35:14,835 - statline_bq.utils - INFO: Succesfully finished function get_metadata_gcp
sweet!
On Sat, May 8, 2021 at 12:06 PM Amit Gal @.***> wrote:
@dkapitan https://github.com/dkapitan Thanks! I learned a quite a lot about decorators and logging - though it feels like scratching the surface :)
prefect definitely formats its log messages with a timestamp, but so do we! See this row https://github.com/dataverbinders/statline-bq/blob/aeb69ddf51345f453bd23e2e8c1f7d57bc1cf8fc/statline_bq/logging_conf.toml#L6, which defines the format of a our log message (it's a pretty standard implementation - I saw it in most use cases I looked at).
A couple of example lines look like this:
2021-05-07 13:35:10,157 - statline_bq.utils - INFO: Starting function main with arguments: (id='83583NED', source='cbs', third_party=False, config=<Box: {'gcp': {'prod': {'cbs_dl': {'project_id': 'dataverbinders-cbs-dl', 'bucket': 'dataverbinders-cbs-dl', 'location': 'EU'}, 'external_dl': {'project_id': 'dataverbinders-external-dl', 'bucket': 'dataverbinders-external-dl', 'location': 'EU'}, 'dwh': {'project_id': 'dataverbinders-open-dwh', 'bucket': 'dataverbinders-open-dwh', 'location': 'EU'}}, 'test': {'project_id': 'dataverbinders-test', 'bucket': 'dataverbinders-test', 'location': 'EU'}, 'dev': {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}}, 'paths': {'cbs': 'cbs', 'mlz': 'mlz', 'rivm': 'rivm', 'politie': 'politie', 'azw': 'azw', 'iv3': 'iv3', 'mkb': 'mkb', 'jm': 'jm', 'interreg': 'interreg', 'agb': 'agb', 'vektis_open_data': 'vektis_open_data', 'bag': 'bag'}}>, gcp_env='dev', force=False) 2021-05-07 13:35:10,159 - statline_bq.utils - INFO: Starting function check_gcp_env with arguments: ('dev') 2021-05-07 13:35:10,159 - statline_bq.utils - INFO: Succesfully finished function check_gcp_env 2021-05-07 13:35:10,159 - statline_bq.utils - INFO: Starting function check_v4 with arguments: (id='83583NED', third_party=False) 2021-05-07 13:35:10,251 - statline_bq.utils - INFO: Succesfully finished function check_v4 2021-05-07 13:35:10,251 - statline_bq.utils - INFO: Starting function cbsodata_to_gbq with arguments: (id='83583NED', odata_version='v3', third_party=False, source='cbs', config=<Box: {'gcp': {'prod': {'cbs_dl': {'project_id': 'dataverbinders-cbs-dl', 'bucket': 'dataverbinders-cbs-dl', 'location': 'EU'}, 'external_dl': {'project_id': 'dataverbinders-external-dl', 'bucket': 'dataverbinders-external-dl', 'location': 'EU'}, 'dwh': {'project_id': 'dataverbinders-open-dwh', 'bucket': 'dataverbinders-open-dwh', 'location': 'EU'}}, 'test': {'project_id': 'dataverbinders-test', 'bucket': 'dataverbinders-test', 'location': 'EU'}, 'dev': {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}}, 'paths': {'cbs': 'cbs', 'mlz': 'mlz', 'rivm': 'rivm', 'politie': 'politie', 'azw': 'azw', 'iv3': 'iv3', 'mkb': 'mkb', 'jm': 'jm', 'interreg': 'interreg', 'agb': 'agb', 'vektis_open_data': 'vektis_open_data', 'bag': 'bag'}}>, gcp_env='dev', force=False) 2021-05-07 13:35:10,251 - statline_bq.utils - INFO: Starting function set_gcp with arguments: (<Box: {'gcp': {'prod': {'cbs_dl': {'project_id': 'dataverbinders-cbs-dl', 'bucket': 'dataverbinders-cbs-dl', 'location': 'EU'}, 'external_dl': {'project_id': 'dataverbinders-external-dl', 'bucket': 'dataverbinders-external-dl', 'location': 'EU'}, 'dwh': {'project_id': 'dataverbinders-open-dwh', 'bucket': 'dataverbinders-open-dwh', 'location': 'EU'}}, 'test': {'project_id': 'dataverbinders-test', 'bucket': 'dataverbinders-test', 'location': 'EU'}, 'dev': {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}}, 'paths': {'cbs': 'cbs', 'mlz': 'mlz', 'rivm': 'rivm', 'politie': 'politie', 'azw': 'azw', 'iv3': 'iv3', 'mkb': 'mkb', 'jm': 'jm', 'interreg': 'interreg', 'agb': 'agb', 'vektis_open_data': 'vektis_open_data', 'bag': 'bag'}}>, 'dev', 'cbs') 2021-05-07 13:35:10,252 - statline_bq.utils - INFO: Succesfully finished function set_gcp 2021-05-07 13:35:10,252 - statline_bq.utils - INFO: Starting function get_urls with arguments: (id='83583NED', odata_version='v3', third_party=False) 2021-05-07 13:35:10,326 - statline_bq.utils - INFO: Succesfully finished function get_urls 2021-05-07 13:35:10,326 - statline_bq.utils - INFO: Starting function get_metadata_cbs with arguments: (id='83583NED', third_party=False, odata_version='v3') 2021-05-07 13:35:10,398 - statline_bq.utils - INFO: Succesfully finished function get_metadata_cbs 2021-05-07 13:35:10,398 - statline_bq.utils - INFO: Starting function get_metadata_gcp with arguments: (id='83583NED', source='cbs', odata_version='v3', gcp=<Box: {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}>, credentials=None) 2021-05-07 13:35:12,364 - statline_bq.utils - INFO: Starting function get_latest_folder with arguments: ('cbs/v3/83583NED', <Box: {'project_id': 'dataverbinders-dev', 'bucket': 'dataverbinders-dev', 'location': 'EU'}>) 2021-05-07 13:35:14,707 - statline_bq.utils - INFO: Succesfully finished function get_latest_folder 2021-05-07 13:35:14,835 - statline_bq.utils - INFO: Succesfully finished function get_metadata_gcp
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dataverbinders/statline-bq/issues/67#issuecomment-835263230, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7GYDWDF2MOLFNJYFNPDYTTMUELPANCNFSM43335PIQ .
Implement logging using the standard
logging
module:print
statements tologging.SOME_LEVEL
prefect
, throughnl-open-data