Closed jdhoffa closed 3 months ago
This is a rather meta issue that probably belongs somewhere else. @jdhoffa? related issue for specifically this repo is here RMI-PACTA/workflow.transition.monitor#70
See also thread in RMI-PACTA/workflow.transition.monitor#252
Putting my thoughts on the "how" of logging here, since this seems to be the most cohesive thread at the moment.
First: there should not be any notable difference between running something interactively and running it via a script. The function should be the function, and any behavior changes should be brought about by arguments. This includes the need to make things like the foo_handler()
above, which introduces a key split between what is used in dev and PROD.
This means that we need to write our functions to be largely ignorant of their contexts, except what is defined by arguments, or shared namespace (functional programming π ). Therefore, logging should always go to the one place that we know exists: stdout
(or stderr
). Any additional logging apparatus should be handled by the logger not by the calling function.
logger
does this automatically, so I would consider re-writing the code example you provided with something closer to this:
# add appender so that reprex captures the log output. Also visible in the stdout dropdown below
logger::log_appender(logger::appender_stdout, index = 2)
foo <- function(bar) {
# some error condition
if (!is.numeric(bar)) {
logger::log_error("Argument `bar` must be numeric, not {class(bar)}.")
stop("Argument `bar` must be numeric.")
}
# some warning condition
if (!is.integer(bar)) {
logger::log_warn(
"Argument `bar` is recommended to be integer, not {class(bar)}."
)
warning("Argument `bar` should be integer")
}
return(bar)
}
foo("bar")
#> ERROR [2024-03-13 17:23:41] Argument `bar` must be numeric, not character.
#> Error in foo("bar"): Argument `bar` must be numeric.
foo(1.5)
#> WARN [2024-03-13 17:23:41] Argument `bar` is recommended to be integer, not numeric.
#> Warning in foo(1.5): Argument `bar` should be integer
#> [1] 1.5
foo(1L)
#> [1] 1
Created on 2024-03-13 with reprex v2.0.2
Awesome. so that covers logging to stdout
, which is ephemeral. Now we have the question of how to persist those logs appropriately. Fortunately, we have some options here:
stdout
in some way. this could be as simple as tee
, or something more complicated like attaching the environment to an Azure Log Workspace (which we are doing with our Azure containers).stdout
(which may be useful if you're dealing with sensitive logs). The downside here is that you need to handle this setup in the application flow.Thankfully again, logger
makes this easy! Remember the log_appender()
call in the reprex above? That's basically saying "please also write the log message to this destination." So we can consider a variant on the above:
# add appender so that reprex captures the log output
logger::log_appender(logger::appender_stdout, index = 2)
# add file outputs to capture logs
logger::log_appender(logger::appender_file("foo.log"), index = 3)
# and because this is a demo, add one that captures in different level and format
logger::log_appender(logger::appender_file("bar.json"), index = 4)
logger::log_threshold("TRACE", index = 4)
logger::log_layout(logger::layout_json(), index = 4)
foo <- function(bar) {
# Add a trace log
logger::log_trace("Entering foo()")
logger::log_trace("Argument `bar` is {bar}")
# some error condition
if (!is.numeric(bar)) {
logger::log_error("Argument `bar` must be numeric, not {class(bar)}.")
stop("Argument `bar` must be numeric.")
}
# some warning condition
if (!is.integer(bar)) {
logger::log_warn(
"Argument `bar` is recommended to be integer, not {class(bar)}."
)
warning("Argument `bar` should be integer")
}
return(bar)
}
foo("some string")
#> ERROR [2024-03-13 18:43:49] Argument `bar` must be numeric, not character.
#> Error in foo("some string"): Argument `bar` must be numeric.
foo(1.5)
#> WARN [2024-03-13 18:43:49] Argument `bar` is recommended to be integer, not numeric.
#> Warning in foo(1.5): Argument `bar` should be integer
#> [1] 1.5
foo(1L)
#> [1] 1
readLines("foo.log")
#> [1] "ERROR [2024-03-13 18:43:49] Argument `bar` must be numeric, not character."
#> [2] "WARN [2024-03-13 18:43:49] Argument `bar` is recommended to be integer, not numeric."
readLines("bar.json")
#> [1] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Entering foo()\"}"
#> [2] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is some string\"}"
#> [3] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"ERROR\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` must be numeric, not character.\"}"
#> [4] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Entering foo()\"}"
#> [5] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is 1.5\"}"
#> [6] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"WARN\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is recommended to be integer, not numeric.\"}"
#> [7] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Entering foo()\"}"
#> [8] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is 1\"}"
Created on 2024-03-13 with reprex v2.0.2
Note here that we're not capturing the messages from stop
and warning
, which is why I've been an advocate for using both logger::log_warn
and warning
together, or log_error
and stop
(use logger
for messages, use warning
/stop
to control application flow).
But in any case, this gives us what I think we're looking for: a function implemented once, and the logging behavior controlled external to the function, but consistent. By default, if a user runs this function interactively, they'll get the useful messages on stdout
. If we're planning on putting this into a script, I would consider doing something like the following, which would include the log messages along with the outputs
# run_pacta.R
output_dir <- sys.getEnv("OUTPUT_DIR")
# include logs along with outputs
logger::log_appender(logger::appender_file(file.path(output_dir, "log.txt"), index = 2)
#write traces to separate dir, so they can be removed/cleaned regularly
logger::log_appender(logger::appender_file(file.path(traces_dir, "log.txt"), index = 3)
logger::log_threshold("TRACE", index = 3)
main(write_to = output_dir)
@AlexAxthelm this sounds very cool!!
NIT: In general, it sounds like @cjyetman doesn't want this discussion to happen on this particular repo that he maintains, so maybe we move this discussion to https://github.com/RMI-PACTA/workflow.pacta ?
moved.
@cjyetman @jdhoffa picking up this thread again, I think what we have in pacta.workflow.utils
is a pretty good approximation for what I'd hope for:
Basically, we import log_*
into the called packages, and write some log_info()
statements from the top-level functions there, and lower level ones get log_debug
and log_trace
as appropriate.
I think that adding some log_*
calls in pacta.portfolio.import
, pacta.portfolio.audit
and pacta.portfolio.allocate
would be pretty useful overall.
@jdhoffa @cjyetman vote to close this issue, unless we think we need more discussion?
Happy to close it!
Context
Many people use R (and the functions that we design for PACTA) interactively. R is developed with this in mind, and tidyverse facilitates this very well.
However, we also have several Docker images and workflows that combine functions and data in particular ways to produce outputs. Two of the largest are: https://github.com/RMI-PACTA/workflow.data.preparation and https://github.com/RMI-PACTA/workflow.transition.monitor
In the case of workflows, it is often useful to run through many functions and decide how to handle errors from each of them (and where to log them) on a case by case. For the transition monitor, we may want to write the logs to an
error_messages.txt
file. For data preparation, we may want to write the logs to amanifest.json
, or some other log output. (We may also want to align the behaviours of these two, and handle logging in a natural and consistent way for both? Relates to RMI-PACTA/pacta.portfolio.audit#2)Benefits of separating the error handling from the function
When writing unit tests, we often just want to know "Is the data being processed in the right way?", ie.
We don't necessarily want to care about the broader behavior of "should it fail if input data came from a step before that had Equity data but no Bonds data, or how do we handle data prep input that is missing some particular scenario that is crucial to one particular metric.
We want to be able to test the precise functionality of the function in the context of the output it produces in isolation.
How functions get combined downstream, and how they should behave when we are trying to generate a report is a separate (but equally important) concept.
Contrary to that, we have a logging function in this package: https://github.com/RMI-PACTA/pacta.portfolio.analysis/blob/main/R/write_log.R
Which is called in various contexts (and often many times) to log different aspects within different functions: https://github.com/RMI-PACTA/pacta.portfolio.analysis/blob/2d322adf82103cab16cc04ffc5e89421b300a431/R/get_input_files.R#L28-L34
This is a bit of an awkward implementation since it makes it difficult to run anything interactively (you would need to have a local logging path). You also need to decide WHICH messages to log (wouldn't it be easier to just automatically log ALL error and warning messages, with appropriate tags?). We are combining the "WHAT" and the "HOW" and it makes it difficult to test.
Extracting the logging and handling functionality would make it much easier to unit_test the actual purpose of functions in
pacta.portfolio.analysis
, while leaving error handling and logging to separate functions, that we can (for example) wrap function calls in when running it in a server or docker context. See a toy example of this functionality in the reprex below:Reprex
Created on 2023-04-13 with reprex v2.0.2
FYI. @cjyetman @AlexAxthelm