ropensci / jagstargets

Reproducible Bayesian data analysis pipelines with targets and JAGS
https://docs.ropensci.org/jagstargets

Allow tar_jags_rep_summary() to accept a target as jags_files argument #13

Closed rich-payne closed 3 years ago

rich-payne commented 3 years ago

Proposal

It would be great if the jags_files argument of tar_jags_rep_summary() could accept a target, rather than a character vector. The purpose of this would be to allow tar_jags_rep_summary() to update automatically if a file with jags code is updated. I'm thinking something like the following:

_targets.R file:

library(targets)
library(jagstargets)
options(crayon.enabled = FALSE)
tar_option_set(memory = "transient", garbage_collection = TRUE)

generate_data <- function (n = 10L) {
  alpha <- stats::rnorm(n = 1, mean = 0, sd = 1)
  beta <- stats::rnorm(n = n, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * beta, 1)
  # Elements of .join_data get joined on to the .join_data column
  # in the summary output next to the model parameters
  # with the same names.
  .join_data <- list(alpha = alpha, beta = beta)
  list(n = n, x = x, y = y, .join_data = .join_data)
}

write_jags_file <- function() {
  path <- "model.jags"
  lines <- "model {
    for (i in 1:n) {
      y[i] ~ dnorm(alpha + x[i] * beta[i], 1)
      beta[i] ~ dnorm(0, 1)
    }
    alpha ~ dnorm(0, 1)
  }"
  writeLines(lines, path)
  path
}

list(
  tar_target(
    jags_file,
    write_jags_file(),
    format = "file"
  ),
  tar_jags_rep_summary(
    validation,
    jags_file,
    data = generate_data(),
    parameters.to.save = c("alpha", "beta"),
    batches = 5, # Number of branch targets.
    reps = 2, # Number of model reps per branch target.
    stdout = R.utils::nullfile(),
    stderr = R.utils::nullfile(),
    variables = c("alpha", "beta"),
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    )
  )
)

Unfortunately, this pipeline currently fails with the following error:

Error in assert_chr(jags_files, "jags_files must be a character vector") : object 'jags_file' not found

wlandau commented 3 years ago

"The purpose of this would be to allow tar_jags_rep_summary() to update automatically if a file with jags code is updated."

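This is already baked into all the tar_jags_*() target factories: each one creates a format = "file" target for every JAGS model file, so downstream targets rerun when a file changes. Unfortunately, jags_files cannot be a target because the factories statically branch over the files, which must therefore exist before the pipeline is defined.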

library(targets)

tar_script({
library(jagstargets)
tar_jags_rep_summary(
  validation,
  c("model1.jags", "model2.jags"),
  data = generate_data(),
  parameters.to.save = c("alpha", "beta"),
  batches = 5, # Number of branch targets.
  reps = 2, # Number of model reps per branch target.
  stdout = R.utils::nullfile(),
  stderr = R.utils::nullfile(),
  variables = c("alpha", "beta"),
  summaries = list(
    ~posterior::quantile2(.x, probs = c(0.025, 0.975))
  )
)
})

tar_manifest(contains("file"), fields = c("name", "command", "format"))
#> # A tibble: 2 x 3
#>   name                   command           format
#>   <chr>                  <chr>             <chr> 
#> 1 validation_file_model1 "\"model1.jags\"" file  
#> 2 validation_file_model2 "\"model2.jags\"" file

tar_visnetwork()

Created on 2021-03-18 by the reprex package (v1.0.0)
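
One possible way to adapt the pipeline from the original post to this constraint is sketched below. It is not taken from the jagstargets documentation. It assumes it is acceptable to write model.jags at the top of _targets.R, so the file already exists when the factory is called, and then to pass the path as an ordinary character vector. The variable name jags_path is hypothetical, and the stdout and stderr arguments from the original post are left at their defaults for brevity.

```r
# _targets.R (sketch; assumes targets, jagstargets, and JAGS are installed)
library(targets)
library(jagstargets)

generate_data <- function(n = 10L) {
  alpha <- stats::rnorm(n = 1, mean = 0, sd = 1)
  beta <- stats::rnorm(n = n, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * beta, 1)
  list(n = n, x = x, y = y, .join_data = list(alpha = alpha, beta = beta))
}

write_jags_file <- function(path = "model.jags") {
  lines <- "model {
    for (i in 1:n) {
      y[i] ~ dnorm(alpha + x[i] * beta[i], 1)
      beta[i] ~ dnorm(0, 1)
    }
    alpha ~ dnorm(0, 1)
  }"
  writeLines(lines, path)
  path
}

# Assumption: writing the model file here, outside any target, guarantees
# that it exists before the factory statically branches over it.
jags_path <- write_jags_file()

list(
  tar_jags_rep_summary(
    validation,
    jags_path, # A plain character vector, not a target.
    data = generate_data(),
    parameters.to.save = c("alpha", "beta"),
    batches = 5, # Number of branch targets.
    reps = 2, # Number of model reps per branch target.
    variables = c("alpha", "beta"),
    summaries = list(
      ~posterior::quantile2(.x, probs = c(0.025, 0.975))
    )
  )
)
```

With this layout, the file target that the factory generates for model.jags is what tracks changes to the model code, so editing the string inside write_jags_file() should invalidate the downstream targets on the next tar_make().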