BristolMyersSquibb / useR2024

Slides for useR2024 presentation
https://bristolmyerssquibb.github.io/useR2024

lm block #4

Closed JohnCoene closed 4 months ago

JohnCoene commented 4 months ago

@DivadNojnarg a suggestion for the "extend" section: a simple lm block.

Let me know what you think

DivadNojnarg commented 4 months ago

@JohnCoene I think it's a good idea, thanks. I am having trouble making it work and am currently checking what is wrong.

JohnCoene commented 4 months ago

I can't get any of the shinylive apps to work :(

DivadNojnarg commented 4 months ago

@JohnCoene Made some changes to simplify the block a bit (removed as.formula by using type = "name"):

new_lm_block <- function(y = character(), predictor = character(), ...) {

  all_cols <- function(data) colnames(data)

  fields <- list(
    y = new_select_field(y, all_cols, title = "Y", type = "name"),
    predictor = new_select_field(predictor, all_cols, title = "Predictor", type = "name")
  )

  new_block(
    fields = fields,
    expr = quote({
      lm(.(y) ~ .(predictor), data = data)
    }),
    ...,
    class = c("lm_block", "transform_block")
  )
}

stack <- new_stack(
  data_block = new_dataset_block("penguins", "palmerpenguins"), 
  lm_block = new_lm_block("bill_length_mm", "body_mass_g")
)
serve_stack(stack)

That's the expression I get for this block:

data %>% {
    lm(bill_length_mm ~ body_mass_g, data = data)
}
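
For context, the `.()` placeholders in the block's `expr` use the same notation as base R's `bquote()`, which splices values into a quoted expression. The sketch below uses base R only and assumes the select fields with type = "name" yield symbols. Whether blockr calls `bquote()` internally is an assumption here, not something this thread confirms.

```r
# Symbols standing in for the values of the two select fields
# (assumption: type = "name" hands the expression a symbol, not a string).
y <- as.name("bill_length_mm")
predictor <- as.name("body_mass_g")

# bquote() replaces each .(x) with the value of x and leaves the rest quoted.
code <- bquote({
  lm(.(y) ~ .(predictor), data = data)
})

# The second element of the `{` call is the spliced lm() call.
print(code[[2]])
```

Because symbols are spliced rather than strings, the formula needs no `as.formula()` step.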

We'd need a custom evaluate_block.lm_block for this one to work, which is a bit annoying to explain:


evaluate_block.lm_block <- function(x, data, ...) {

  stopifnot(...length() == 0L)

  eval(
    substitute(expr, list(expr = generate_code(x))),
    list(data = data)
  )
}
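
The evaluation pattern in this method can be tried outside blockr. A minimal sketch in base R, where `evaluate_quoted` is a hypothetical helper mirroring the method body, the quoted expression stands in for what `generate_code(x)` returns, and `mtcars` replaces the penguins data:

```r
# Stand-in for the output of generate_code(x); the real value comes from blockr.
code <- quote({
  lm(mpg ~ wt, data = data)
})

# Hypothetical helper with the same body as evaluate_block.lm_block above:
# substitute() drops the quoted code in place of `expr`, and eval() runs it
# with `data` bound to the data frame passed in.
evaluate_quoted <- function(code, data) {
  eval(
    substitute(expr, list(expr = code)),
    list(data = data)
  )
}

fit <- evaluate_quoted(code, mtcars)
class(fit)
```

The expression is evaluated as is, without piping `data` into it, so `lm()` can keep the formula as its first argument.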

Maybe we can use another tool to create the lm, one that takes data as its first param. We could also write the expression with data first, but I find that confusing, as people would wonder why they have to do this:

lm(data = data, formula = .(y) ~ .(predictor))
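
The data-first call relies on R's argument matching: named arguments are matched first, so once `formula` is named, a data frame supplied in first position falls through to `data`. A minimal sketch in base R, with `mtcars` as a stand-in dataset:

```r
# Usual order: formula first, data named.
formula_first <- lm(mpg ~ wt, data = mtcars)

# Data first, both arguments named, as in the call above.
data_first <- lm(data = mtcars, formula = mpg ~ wt)

# Data first and unnamed, which is how a pipe would supply it:
# `formula` is matched by name, so mtcars is matched to `data`.
data_positional <- lm(mtcars, formula = mpg ~ wt)

# All three calls fit the same model.
all.equal(coef(formula_first), coef(data_first))
all.equal(coef(formula_first), coef(data_positional))
```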

DivadNojnarg commented 4 months ago

@JohnCoene this works for me. The only weirdness is that I pass the data as the first param of lm, but this avoids creating a custom evaluate method. Then, by re-adding broom::tidy, we get the result table:

library(blockr)
library(palmerpenguins)

new_lm_block <- function(y = character(), predictor = character(), ...) {

  all_cols <- function(data) colnames(data)

  fields <- list(
    y = new_select_field(y, all_cols, title = "Y", type = "name"),
    predictor = new_select_field(predictor, all_cols, title = "Predictor", type = "name")
  )

  new_block(
    fields = fields,
    expr = quote({
      model <- lm(data = data, formula = .(y) ~ .(predictor))
      broom::tidy(model)
    }),
    ...,
    class = c("lm_block", "transform_block")
  )
}

stack <- new_stack(
  data_block = new_dataset_block("penguins", "palmerpenguins"), 
  lm_block = new_lm_block("bill_length_mm", "body_mass_g")
)
serve_stack(stack)

DivadNojnarg commented 4 months ago

Added a commit to propose this suggestion. Feel free to modify. Maybe you also want to open the customise section by saying: "As we showed before with the lm_block, we had to pass data first to use the default method. If we don't put data as the first param, we can create our own evaluate_block.lm_block method to handle this". Not sure if we have time to go to that level, but it's a good transition.

JohnCoene commented 4 months ago

As mentioned, I thought of changing the example in extend from what we have to "wrapping a function." On second thought, it's not a very good idea: for this example we'd have to write a function that builds a formula, and that's actually more complicated than using blockr fields with type = "name".
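
For comparison, a minimal sketch of what such a wrapper might look like in base R. `fit_lm` is a hypothetical name, not part of blockr or the slides, and `mtcars` is a stand-in dataset:

```r
# Hypothetical wrapper: builds the formula from column names given as strings.
fit_lm <- function(data, y, predictor) {
  # reformulate() turns "wt" and "mpg" into the formula mpg ~ wt.
  model_formula <- stats::reformulate(predictor, response = y)
  stats::lm(model_formula, data = data)
}

fit <- fit_lm(mtcars, y = "mpg", predictor = "wt")
coef(fit)
```

The extra formula-building step is what the field-based version avoids.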

Good to merge for me