Closed ElenaKhaustova closed 5 days ago
Relaying internal discussion:
@astrojuanlu: Rather than a boolean argument, could this be configured at the logger level instead? @ElenaKhaustova: I think the problem is that users want to silent this particular method rather that disable logging at all.
I’m not sure we should have a custom approach for dealing with logging for dataset loading compared to other log messages.
The amount of logging information during the catalog.load()
does not look extensive and should not be changed IMO. However, I can imagine cases with tens of datasets where this INFO
output will pollute a cell.
I also agree that adding a custom flag just for one method - catalog.load()
is not the best solution.
Based on the above we can consider the following:
import logging
logger_level = logging.getLogger("kedro").level logging.getLogger("kedro").setLevel(logging.WARNING)
for ds in catalog.list(): _ = catalog.load(ds)
logging.getLogger("kedro").setLevel(logger_level)
3. Make the `_logger` property public so that users can modify it and set the desired logging level just for the catalog component rather than globally. In that case they would not need to reset it.
https://github.com/kedro-org/kedro/blob/413dbca46a9cb8e29fc1761564c4e13e2efe4d8e/kedro/io/data_catalog.py#L201
This is a pattern I've never seen. And yet, can't the user do
logging.getLogger("kedro.io.data_catalog").setLevel(logging.WARNING)
? (Haven't tested this, unsure if it works)
@astrojuanlu
Yes, that works and doesn't affect the top-level loggers! So I believe that should be the solution and no changes are required as all of them lead to the same amount of effort from the user side.
Closing the issue.
So I still think we should make this easier for the user:
logging.yml
setting?So I still think we should make this easier for the user:
- Docs at the very least
logging.yml
setting?
Good point, I will extend the section on how to customise logging then. But I would not change the default logging.yml
as having INFO
level seems reasonable at the start.
Completed in #3968
Description
catalog.load()
produces extensive logging information when loading multiple datasets in the same notebook cell. These logger messages annoy users and cannot be disabled with the current implementation.Context
"It’s very convenient to load my dataset when debugging locally. The only annoying thing is the multiple
logger.info
messages I get when I load multiple datasets in the same notebook cell."Possible Implementation
catalog.load()
to disable logging prints for cleaner outputs.catalog.load()
can be reduced.