bms63 closed this 4 months ago
@bms63 Agree! Would you be interested in kicking this one off? It could even be a logrx vs logr example to start with, and then others could add more alternatives if we advertised it across our Slack community.
@bms63 @rossfarrugia Howdy. I've been working on a page regarding logr and logrx, but I was thinking that it might be worth trying a new strategy. Makefiles, Docker and logging would make a nice addition to a section on reproducibility: Makefiles for starting everything in a similar way, Docker for a reproducible environment, and logging for step-by-step instructions on recreating the results / analysis. The README.md gives me the impression that a page combining all these concepts together is more aligned with the goals of the book.
Before I get back to the drawing board, what do you think of this approach?
Hi @DavidBlairs - I think we still need the logr and logrx example. This is an ever-present question, and having it available to new users as a simple example would be very handy for us to showcase/point to in discussions.
This example could then point to another section on reproducibility that goes into makefiles, docker/containers and thinking about building/maintaining these environments. I think the average stats programmer might find this section a bit overwhelming, but someone coming from a leadership position or designing this type of system might really appreciate our details and thoughts here. Hoping we could point to other existing initiatives and resources as well!
Thanks for digging into this!!
@bms63 good point, approachability is key.
On a similar note, if you were going to write up a page comparing the two packages, how would you approach it? I have two drafts at the moment but I'm not sure I'm happy with either of them. My main quandary is finding an effective structure, so I'd be grateful to hear any ideas on that front.
From my basic understanding, the two packages approach logging slightly differently. I think these differences should be the main highlight as both are trying to demonstrate reproducibility and control of the R programs.
For example, logr has more of a SAS approach to the R file, where almost every step is recorded. This will produce quite a large log for long files.
logrx looks more to recording the environment and what was used in the R file. This is more compact.
I think the logs should be displayed in a scrollable iframe so it doesn't take up the entire screen!! See here for an example https://pharmaverse.github.io/logrx/articles/adsl_demo.html
@bms63 I get you. There's already great documentation on each package separately. Highlighting the distinction between the two allows the reader to select which one is best for them and avoids producing redundant documentation. Many thanks for the suggestion and good call on the scrollable iframes!
Hi all, I like the way this is heading from the above discussion and am happy to take a look at any PRs. For the page on reproducible environments, it would be good to also share an example there referencing this pharmaverse package: https://pharmaverse.github.io/envsetup/main/. These are all great topics and definitely things we hear new companies adopting R have questions around.
Lots of folks want to know how to log R scripts. I think this is a good place to document how different companies are approaching it.