MUCollective / multiverse

R package for creating explorable multiverse analysis
https://mucollective.github.io/multiverse/
GNU General Public License v3.0
62 stars 5 forks source link

supported syntax for branches #80

Closed abhsarma closed 2 years ago

abhsarma commented 4 years ago

I noticed Alex trying to use branch in two ways which are not supported currently:

let's say we have a dataframe

df <- data.frame(y = 1:10)

in this one, the pkg is not able to parse the expression with assignment:

branch(param_1,
  "option1" ~  df <- df %>% mutate(x = 1)
  "option2" ~ df <- df %>% mutate(x = 2)
)

and in this one, the pipe is making things more difficult to read:

df <- df %>%
    branch(some_transformations,
        "option1" ~ mutate(x = 1),
        "option2" ~ mutate(x = 2)
    )

should we be supporting these? I don't think it would be too difficult to solve these, but I guess should we impose some structure on we allow users to call branch? or should users be able to call it on whatever, as long as removing it results in some valid R code?

mjskay commented 4 years ago

should users be able to call it on whatever, as long as removing it results in some valid R code?

Yes, I think so. That is the simplest and most general solution. So:

1) Why doesn't the assignment one work?

2) The pipe one works currently, so should continue working right?

abhsarma commented 4 years ago

ah sorry this does not work:

df <- data.frame(
  y = 1:10,
  x = runif(10, 0, 10)
)

and then in multiverse:

df <- df %>%
    branch(some_transformations,
        "option1" ~ filter(x > 5) %>% mutate(z = 1),
        "option2" ~ filter(x > 3) %>% mutate(z = 1)
    )

So both don't work for the same reason --- because of the pipe or <- it runs into some problems parsing the expression. Should be an easy fix, I'll try to do it tonight!

abhsarma commented 4 years ago

the reason I don't think its a particularly good way of writing code is that then it puts the locus of the change further up the chain. i.e. the user might end up writing huge blocks of code in each branch, and then the multiverse does not become very useful. For instance, what Alex is trying to do in his examples.

If instead he used functions the code would be much simpler and less redundant.

mjskay commented 4 years ago

Right but that's down to coding style, and it's hard to anticipate when that will be good or bad. Better to have a general, simple, and consistent interface than one that tries to guess what the user is doing or enforce a particular style.

(nevermind I misread the second example)

mjskay commented 4 years ago

Re: the giant nested branches, did you see our slack convo? That came down to him not realizing you can replace arbitrary sub expressions with a branch, sounds like he's going to fix it.

abhsarma commented 4 years ago

yeah makes sense. I was just thinking of fixing it but not documenting it so that people just don't just jump to use it that way. It's cognitively low effort, but not super good use of the package.

I did see the slack convo, but based on what he was trying to do, I don't think you can avoid the large blocks of code (at least in some parts) just because not everything should be done in a tidy workflow always...?

mjskay commented 4 years ago

yeah makes sense. I was just thinking of fixing it but not documenting it so that people just don't just jump to use it that way. It's cognitively low effort, but not super good use of the package.

Sorry I lost the thread, which part are you talking about here?

I did see the slack convo, but based on what he was trying to do, I don't think you can avoid the large blocks of code (at least in some parts) just because not everything should be done in a tidy workflow always...?

Hmm, dunno, I'd have to look more closely. From my initial look it seemed like stuff could be simplified.

abhsarma commented 4 years ago
branch(param_1,
  "option1" ~  df <- df %>% mutate(x = 1)
  "option2" ~ df <- df %>% mutate(x = 2)
)

I'm trying to solve this, but I think the problem with supporting syntax like this is that, I am unable to use parse or rlang::parse_expr to convert text like this x <- "branch(model, 1 ~ data = df)" to an expression:

x <- "branch(model, 1 ~ data = df)"
parse(text = x)
mjskay commented 4 years ago

Yeah the problem is it parses as a parameter assignment to the name 1 ~ data, which is syntactically incorrect.

I don't think you need to fix this, users just need to do this instead:

branch(param_1,
  "option1" ~ {df <- df %>% mutate(x = 1)}
  "option2" ~ {df <- df %>% mutate(x = 2)}
)
abhsarma commented 4 years ago

right...I encountered this when I was trying to specify different arguments to a function using branch. For example

f <- function(a, b, c) {...}

x <- f(
        a = y,
        branch(args,
            "arg1" ~ b = 2,
            "arg2" ~ c = 2)
    )

This means that there is a limitation and something like this cannot be done?

abhsarma commented 4 years ago

nevermind what you suggested works for this as well!

I'll document this clearly but this is good to know!

mjskay commented 4 years ago

oh that's cool! I didn't know it would work for function arguments too, that is good to know! :)