yihui / knitr

A general-purpose tool for dynamic report generation in R
https://yihui.org/knitr/
2.37k stars 874 forks source link

A logging solution for debugging `knitr` and dependent packages #2203

Open leogama opened 1 year ago

leogama commented 1 year ago

Hi, @yihui. I've been working on solutions for Python in knitr with reticulate. I'm also exploring and extending (for me) features of rmarkdown and bookdown, including some templating with jinjar. It's been all fun and games, but the interdependence and complexity of these separate packages is starting to slow down my adventures...

Beyond that, I hit a nasty file location bug recently and I'm unable to use some standard R debugging tools like browser() or recover() to inspect the environment at the failure point because it happens many levels deep (an R function called from a Python hook, called by a knitr block, possibly in a child process). I thought of using some "cat/print" calls, but there's another problem as stdout and stderr are both used in my setup for one thing or another and I'm never sure if the message could be swallowed somehow by knitr itself...


I'd like you to consider adding some logging facility to knitr. It may be either a standalone package like logging or a streamlined version of it included in the source. I'm thinking of something that could:

For example, knitr could expose a logger object, like implemented in Python's and R's logging, that would be used from other interacting packages by calling knitr:::logger$debug("sample message"). The code base(s) could then have some basic messages like "entering R/Python block", "running after hooks", "reading results from cache", etc. predefined for localizing the stage of code execution, and when debugging something we would be able to add a couple of temporary extra logging messages without needing to care about the whole logging setup.

I'm motivated to write a draft PR, but I wanna know your opinion first.


By filing an issue to this repo, I promise that

I understand that my issue may be closed if I don't fulfill my promises.

yihui commented 1 year ago

I don't have much experience here, but if the (initial) PR is not complicated, I can definitely have a look. Thank you!

leogama commented 1 year ago

I don't have much experience here, but if the (initial) PR is not complicated, I can definitely have a look. Thank you!

Great! It's not meant to be complicated, just need to get a couple of details right. Performance impacts should be negligible. And I don't think this feature should be exposed as public API, it's solely to ease debugging and development.

Are you fine with a new dependency on the package logging (CRAN)? It has no sub-dependencies and implements the same API as Python's logging module, which is nice.

yihui commented 1 year ago

Adding soft dependencies (i.e., Suggests) is almost always okay to me.

Keeping this functionality internal is also okay. Thanks!

cderv commented 1 year ago

I like the idea of debugging options. I think it should be really lightweight, for internal debugging mainly. Otherwise it could become quite verbose.

What motivates the logging package ?

I know about more recent option in logger 📦 which I like the flexibility https://daroczig.github.io/logger/

Or a very lightweight option useful that makes add it as suggest package quite easy https://cran.rstudio.com/web/packages/debugme/index.html Licenced MIT

Anyway, just adding some other choices I am more used to. There are several logging packages - not sure what is the best for us.

leogama commented 1 year ago

Thanks for the suggestions, @cderv. I had seen logger when looking for options, but didn't know about debugme —it's really interesting!

I like the idea of debugging options. I think it should be really lightweight, for internal debugging mainly. Otherwise it could become quite verbose.

Yes, I'm proposing this only for debugging and development of knitr and packages that extend it, not for users.

What motivates the logging package ?

I've suggested logging because it's stable, implements the nice Python logging API and, more important, it has no dependencies. But there are alternatives with only a few dependencies.

I know about more recent option in logger package which I like the flexibility https://daroczig.github.io/logger/

Oh! I just noticed that logger's dependencies are in "Suggests", so this is fine too.

Or a very lightweight option useful that makes add it as suggest package quite easy https://cran.rstudio.com/web/packages/debugme/index.html Licenced MIT

This logging approach is very attractive, ~however it doesn't seem to be~ and it is able to log to a file. As knitr can run code many levels deep and generally sinks stderror output, this is an important requisite.


Now, I'm not sure if @yihui is willing to accept a lightweight package as hard dependency. If not, than we need to find a package with a logging API that can be set up, exposed and called from other packages (e.g. rmarkdown) conditionally. But if such a solution exists and is practical to use, we won't even need to argue about dependencies...

debugme is incredibly easy to set up and enable/disable with multiple packages. ~As we only need a single output stream to a single file, I'm thinking of writing a small extension to be able to redirect debugme's output and take advantage of it's cool features.~

If you have any other thoughts, please share! But let me brew these ideas for a few days.

cderv commented 1 year ago

You can write to file with debugme. Everything is controled by env var externally by the user. See redirecting output section in the help page https://rdrr.io/cran/debugme/man/debugme.html

leogama commented 1 year ago

I will try adding this snippet to .onLoad():

if (requireNamespace('debugme', quietly = TRUE))
    debugme::debugme()
yihui commented 1 year ago

I'd prefer an explicit opt-in like if (getOption('knitr.enable.debugme', FALSE)) debugme::debugme(), otherwise debugme will always be loaded if installed when loading knitr.

leogama commented 1 year ago

Well, the DEBUGME environment variable must be set for debugme to do anything. Shouldn't we check this instead of creating a redundant setting?

Example:

if (Sys.getenv('DEBUGME') != '' && requireNamespace('debugme', quietly = TRUE))
    debugme::debugme()
yihui commented 1 year ago

Sure. That's definitely more appropriate!