Open christofs opened 2 years ago
I took a stab at this today. https://gist.github.com/jmclawson/52252349dd100e426c2267b5de48aade
Does the code make sense as you imagine it, @christofs? There are mainly two functions it makes available.
The first, stylo_log(), accepts a stylo() object that has just been created, and it logs the date and time, the stylo call, and the config file. It's used like this:
# option 1: pipe from stylo() into stylo_log()
stylo() |> stylo_log()
# option 2: enclose stylo() in stylo_log()
stylo_log(stylo())
# option 3: call stylo_log() on a stylo object immediately after creating it:
my_object <- stylo()
stylo_log(my_object)
Options exist, including log_label to redefine the label of the folder and log file, add_dir_date to add a date to the folder name (by default it doesn't do this), and log_date for appending a date to the end of the text file (with a default value of Sys.Date()). At its simplest, stylo_log() will create a folder called "stylo_log" containing a text log file for each day analyses are run.
At the same time that it appends the call and configuration to a log file, it also copies any files made at the same time as stylo_config.txt into the directory used for logging, prepending each of their file names with the date and time they were originally created.
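The logging behaviour described above can be sketched in base R. This is only a rough illustration and not the code from the gist: the helper name log_run_sketch(), its arguments, the window_secs tolerance, and the log layout are all assumptions, and it takes the call as a plain string rather than a stylo object.

```r
# Hypothetical helper: illustrates the logging idea only; not the gist's stylo_log().
log_run_sketch <- function(call_text,
                           config_file = "stylo_config.txt",
                           log_label = "stylo_log",
                           log_date = Sys.Date(),
                           window_secs = 5) {
  dir.create(log_label, showWarnings = FALSE)
  log_file <- file.path(log_label, paste0(log_label, "_", log_date, ".txt"))

  # Use the config file's modification time as the time stamp of the run.
  run_time <- file.mtime(config_file)
  stamp <- format(run_time, "%Y-%m-%d %H:%M:%S")

  # Append one entry: time stamp, call, and the configuration as written by the run.
  entry <- c(paste("##", stamp),
             paste("call:", call_text),
             readLines(config_file),
             "")
  cat(entry, file = log_file, sep = "\n", append = TRUE)

  # Copy files written at about the same time as the config file,
  # prefixing each copy with the time stamp (hyphens keep the names portable).
  files <- list.files(".")
  files <- files[!dir.exists(files)]
  age <- abs(as.numeric(difftime(file.mtime(files), run_time, units = "secs")))
  near <- files[age <= window_secs]
  prefix <- format(run_time, "%Y-%m-%d_%H-%M-%S")
  file.copy(near, file.path(log_label, paste(prefix, near, sep = "_")))

  invisible(log_file)
}
```

Matching files by modification time is a heuristic, so the tolerance may need adjusting if an analysis takes a while to write all of its output.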
The second function, stylo_replicate(), is a little more complex. It will do two things:
1. Without a date_time argument, it will run both stylo() and stylo_log(), passing along the log_label, add_dir_date, and log_date arguments to stylo_log(), while passing along ... to stylo(). It's used like this: stylo_replicate() (with the parentheses accepting anything that will work with stylo())
2. When date_time is passed as an argument, it will parse the appropriate log (with "appropriate" defined by defaults or by the log_label, add_dir_date, and log_date arguments) created by stylo_log() to find the settings used for a previous analysis from that date and time, re-run the analysis using the same settings, and add an item to the log. It is used like this: stylo_replicate("2023-01-27 13:46:26")
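To make the log-parsing step concrete, here is a small sketch of looking up a logged call by its time stamp. The log layout (a "## " header line followed by a "call: " line) and the function name find_logged_call() are assumptions made for illustration; the gist may store its entries differently.

```r
# Hypothetical parser: the log layout used here is an assumption for illustration,
# not necessarily the format written by the gist's stylo_log().
find_logged_call <- function(log_lines, date_time) {
  # Each entry is assumed to start with a header line "## YYYY-MM-DD HH:MM:SS".
  headers <- grep("^## ", log_lines)
  start <- headers[log_lines[headers] == paste("##", date_time)]
  if (length(start) == 0) stop("No log entry found for ", date_time)
  start <- start[1]

  # The entry runs until the next header, or to the end of the log.
  later <- headers[headers > start]
  end <- if (length(later) > 0) later[1] - 1 else length(log_lines)
  entry <- log_lines[start:end]

  # The call is assumed to be stored on a line beginning with "call: ".
  call_line <- grep("^call: ", entry, value = TRUE)[1]
  sub("^call: ", "", call_line)
}

# Made-up log content for the example.
log_lines <- c(
  "## 2023-01-27 13:46:26",
  "call: stylo(gui = FALSE, mfw.min = 100)",
  "",
  "## 2023-01-27 14:02:10",
  "call: stylo(gui = FALSE, mfw.min = 500)"
)
logged <- find_logged_call(log_lines, "2023-01-27 13:46:26")
print(logged)
```

The recovered string could then be turned back into a call with parse(text = logged)[[1]] and run with eval(). That step is left out here because it needs the stylo package and a corpus to be available.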
Sounds cool! Will report back as soon as I have been able to do a test run.
This is just an idea or a suggestion. With the way stylo works at the moment, it is not easy to create results that can be replicated or precisely documented. The reasons for this, if my probably superficial analysis is correct, have something to do with the following aspects:
In practice, this means people need to copy this stuff to a new folder whenever they think an analysis is good and should be kept. More often than not, by the time you realize this, the "stylo_config" is already overwritten and the table of frequencies wasn't saved or was overwritten in the meantime as well.
A simple solution for this could be a "replication mode" or "documentation mode" that can be activated when calling stylo. One could simply say: "documentation=TRUE". Then, the following things would happen:
Of course, this creates a lot of data. Folders that turn out not to be useful need to be deleted at some point. But at least no data is lost.
To replicate an analysis, one simply needs to set the working directory to the right time-stamped folder and run stylo again to repeat (and then possibly vary) the analysis. Maybe a parameter like "replication=TRUE" could be used to activate all parameters necessary, for instance to make sure stylo uses the frequency table from the documentation.
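As a rough sketch of the time-stamped folder idea, the wrapper below runs an arbitrary analysis function inside a freshly created folder, so whatever files it writes end up together. The name run_documented(), the "analysis_" folder prefix, and the base_dir argument are hypothetical, and the analysis function is a stand-in: a real stylo() call would also need to find its corpus, which this sketch does not handle.

```r
# Hypothetical wrapper: runs any analysis function inside a fresh time-stamped folder.
run_documented <- function(analysis, base_dir = ".") {
  stamp <- format(Sys.time(), "%Y-%m-%d_%H-%M-%S")
  run_dir <- file.path(base_dir, paste0("analysis_", stamp))
  # Two runs within the same second would share a folder; warnings are silenced for that case.
  dir.create(run_dir, recursive = TRUE, showWarnings = FALSE)
  run_dir <- normalizePath(run_dir)

  # Work inside the new folder and restore the previous working directory afterwards.
  old_wd <- setwd(run_dir)
  on.exit(setwd(old_wd), add = TRUE)

  analysis()
  invisible(run_dir)
}

# Usage with a stand-in for an analysis that writes a config file.
run_dir <- run_documented(function() {
  writeLines("mfw.min = 100", "stylo_config.txt")
}, base_dir = tempdir())
list.files(run_dir)
```

Replicating a run would then amount to calling setwd(run_dir) and re-running the analysis from there.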
Maybe not quite thought out to the end, but something along these lines might be useful.