Closed aaron-mcdaid-zalando closed 6 years ago
In this PR, I've disabled the functionality to check if the variances are too different from each other. I'm not sure anybody uses this, and it creates a lot of log messages for some datasets
If somebody find these messages useful, keep in mind that there is an alternative: to change a log level for such messages to DEBUG or TRACE and set a limit to INFO in the host component (one which uses a lib)..I'm not completely sure how to do a later part in Python, but I believe there must be a way.
I have to ask a stupid question. I hear a lot about logging around Zalando, but very little clarity. (I guess I just need to ask much tougher questions!). Does scalyr care about DEBUG/INFO/TRACE? I mean, will we save money if the messages we send to scalyr have a different level?
My guess is that it won't save anything. My guess is that, instead, we need to stop sending info to scalyr in the first place?
... and set a limit to INFO in the host component ...
I guess this is the interesting bit. Are you saying that, in Python, we should configure the logger such that certain messages are simply not printed in the first place? Configure the logger
object in Python to ignore all DEBUG
messages for example?
@aaron-mcdaid-zalando scalyr doesn't care about level. neither scalyr client and logging system in AWS instances care. Decision is being made in application code which messages to log and which not, according to logging level configs. It is not solely about cost cut, but also about usability of those logs. I've sent you a PM how we do it.
Closing this for now, unless somebody else is still interested? Users of expan
can use the following code to suppress most logging from expan:
for logger_name_to_suppress in [
'expan.core.early_stopping',
'expan.core.experiment',
'expan.core.statistics',
]:
logging.getLogger(logger_name_to_suppress).setLevel(logging.ERROR)
# Only 'ERROR' and 'CRITICAL' survive this
oh, it hasn't been merged -- great!
Indeed. This is somewhat obsoleted by other logging changes we've made
These messages seem to be the most repetitive. Also, they don't seem very useful to me
So this PR proposes to remove them. If they are valuable, we could consider making them optional
Also, even if somebody does want to record the number of NaNs, the current message is kinda of useless and is not concise. I would prefer to log a simple, concise, report of the number of NaNs is each column