RobinDenz1 / simDAG

An R-Package to Simulate Simple and Complex (longitudinal) Data from a DAG and Associated Node Information
https://robindenz1.github.io/simDAG/
GNU General Public License v3.0

Adding variables to the formula #7

Closed abduazizR closed 2 weeks ago

abduazizR commented 2 weeks ago

Is it possible to use variables inside the formula argument?

Here is an example that I would like to get working:

my_effect <- 0.2

# Not working
dag <- empty_dag() +
  node("L", type = "rbernoulli") +
  node("A", type = "binomial", formula = ~ 0.2 + L * my_effect)

dag_data <- sim_from_dag(dag, n_sim = 10000)

# Not working
glm(A ~ L, data = dag_data, family = binomial("logit"))

It gives me an error message.

Thank you

RobinDenz1 commented 2 weeks ago

You should be able to put eval() around the my_effect variable, and it should usually work fine. Unfortunately, in your specific example it still doesn't, due to a known bug that I still have to fix. The bug occurs whenever every beta-coefficient is wrapped in a function call. It will be fixed soon. Just to showcase it though, this works:

my_effect <- 0.2

dag <- empty_dag() +
  node("L", type = "rbernoulli") +
  node("D", type="rbernoulli") +
  node("A", type = "binomial", formula = ~ 0.2 + L * eval(my_effect) + D * 1)

dag_data <- sim_from_dag(dag, n_sim = 1000000)

glm(A ~ L + D, data = dag_data, family = binomial("logit"))

Once I have fixed the mentioned bug, your example will also work if you put eval() around my_effect.

abduazizR commented 2 weeks ago

Thanks for the prompt response. The example you created works fine. However, when I remove the node D, it breaks.

library(simDAG)

my_effect <- 0.2

dag <- empty_dag() +
  node("L", type = "rbernoulli") +
  node("A", type = "binomial", formula = ~ 0.2 + L * eval(my_effect))
#> Warning in node("A", type = "binomial", formula = ~0.2 + L * eval(my_effect)):
#> Using regular formulas in 'formula' was deprecated in version 0.2.0 and will no
#> longer be supported in the next version of this package. Please use the new
#> custom formulas instead.

dag_data <- sim_from_dag(dag, n_sim = 1000000)
#> Error: An error occured when processing node 'A'. The message was:
#> Error in terms.formula(object, data = data): invalid model formula in ExtractVars

glm(A ~ L, data = dag_data, family = binomial("logit"))
#> Error in eval(mf, parent.frame()): object 'dag_data' not found

RobinDenz1 commented 2 weeks ago

Yes, that's what I was trying to tell you. There is currently a bug where at least one regular combination of VARIABLE * NUMBER must be present in the formula for it to work. It will hopefully be fixed soon.
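
As a possible stopgap that avoids the formula interface altogether, the same node can probably be specified through the parents, betas and intercept arguments of node(), where ordinary R variables can be passed directly. This is only a sketch: it assumes those arguments behave as described in the simDAG documentation, and the seed and sample size are arbitrary choices made for illustration.

```r
library(simDAG)

set.seed(42)  # arbitrary seed, only for reproducibility of this sketch

my_effect <- 0.2

# Same structure as the example above, but the coefficient is passed
# through 'betas' instead of being written into a formula
dag <- empty_dag() +
  node("L", type = "rbernoulli") +
  node("A", type = "binomial", parents = "L",
       betas = my_effect, intercept = 0.2)

dag_data <- sim_from_dag(dag, n_sim = 100000)

# The fitted slope for L should be close to my_effect
fit <- glm(A ~ L, data = dag_data, family = binomial("logit"))
coef(fit)
```

With a large n_sim, the estimated coefficient for L should land near the value stored in my_effect, which gives a quick check that the variable was picked up as intended.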

RobinDenz1 commented 2 weeks ago

The latest update should fix this issue. All of your examples should now work once you have installed the development version from GitHub.
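
For reference, a minimal sketch of installing the development version. It assumes the remotes package is used for the installation (devtools::install_github() is a common alternative); the repository path is the one shown at the top of this page.

```r
# Install the helper package first if it is not available yet (assumption:
# 'remotes' is the installer of choice here)
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Install the development version of simDAG from GitHub
remotes::install_github("RobinDenz1/simDAG")

# Check which version is now installed
packageVersion("simDAG")
```

Restarting the R session after the installation helps make sure the newly installed version is the one that gets loaded.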