aalfons / robmed

Perform mediation analysis via a fast-and-robust bootstrap test, as well as various other methods
GNU General Public License v3.0

Add function to generate data from estimated mediation model #29

Closed aalfons closed 2 years ago

aalfons commented 2 years ago

Add function rmediation() to generate data from an estimated mediation model. This can be useful to carry out some simulations based on an empirical example. Some thoughts:

  1. For results from bootstrap tests, there should be an argument type = c("boot", "data") to select whether the bootstrap estimates or the estimates on the original data should be used in the mediation model to generate the data.
  2. There should be an argument to select whether the estimates of the regression coefficients should be used as the true values in the mediation model, or whether the true values should be sampled from a (normal) distribution with the point estimates as the mean and the standard errors as the standard deviation.
  3. There should be an argument to select whether the error terms should be drawn with replacement from the observed residuals, or whether they should be sampled from the model distribution (typically Gaussian, but it can also be a skew-normal, t, or skew-t distribution if the model was estimated with package sn).

aalfons commented 2 years ago

Ad point 2: If the true values are sampled from a (normal) distribution, the sampled values would have to be stored as attributes of the generated data. I'm not convinced that that's a great solution. Let's skip this feature for now; perhaps it can be added at a later point.
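
To illustrate why the sampled values would need to be stored, here is a minimal, language-agnostic sketch in Python (not robmed code). The function name `draw_true_coefficients` and all numbers are hypothetical; the only assumption taken from the discussion above is that each true value is drawn from a normal distribution with the point estimate as mean and the standard error as standard deviation.

```python
import random


def draw_true_coefficients(estimates, std_errors, seed=None):
    """Draw one 'true' value per coefficient from N(estimate, std_error^2)."""
    rng = random.Random(seed)
    return {name: rng.gauss(est, std_errors[name])
            for name, est in estimates.items()}


# Made-up point estimates and standard errors, for illustration only.
estimates = {"a": 0.5, "b": 0.4, "direct": 0.1}
std_errors = {"a": 0.10, "b": 0.08, "direct": 0.09}

truth_1 = draw_true_coefficients(estimates, std_errors, seed=1)
truth_2 = draw_true_coefficients(estimates, std_errors, seed=2)

# The true indirect effect a * b differs between draws, so each simulated
# data set would have to carry its own drawn values along with it.
indirect_1 = truth_1["a"] * truth_1["b"]
indirect_2 = truth_2["a"] * truth_2["b"]
```

Without the drawn values attached to the generated data, a user could not tell which indirect effect a given simulated data set was generated from.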

aalfons commented 2 years ago

Ad point 3: Let's limit this to sampling with replacement from the independent and control variables, as well as the error terms from the residuals of the corresponding regressions. Otherwise it gets very complicated if for each of those variables and error terms a different distribution is specified, along with different parameters of those distributions.
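
The resampling scheme described here could be sketched as follows, in Python rather than R and without control variables to keep it short. This is a sketch under stated assumptions, not the eventual rmediation() implementation: the function name `simulate_mediation`, its arguments, and the choice to resample the independent variable and the two sets of residuals independently of each other are all hypothetical.

```python
import random


def simulate_mediation(x, resid_m, resid_y, i_m, a, i_y, b, direct, n, seed=None):
    """Generate n observations from a simple mediation model x -> m -> y.

    x        observed values of the independent variable
    resid_m  residuals of the regression of m on x
    resid_y  residuals of the regression of y on m and x
    i_m, a   intercept and slope of the m ~ x regression
    i_y, b, direct  intercept and slopes of the y ~ m + x regression
    """
    rng = random.Random(seed)
    # Draw with replacement from the observed values and residuals
    # (assumption: the three draws are independent of each other).
    x_new = rng.choices(x, k=n)
    e_m = rng.choices(resid_m, k=n)
    e_y = rng.choices(resid_y, k=n)
    # Plug the draws into the estimated regression equations.
    m_new = [i_m + a * xi + ei for xi, ei in zip(x_new, e_m)]
    y_new = [i_y + b * mi + direct * xi + ei
             for mi, xi, ei in zip(m_new, x_new, e_y)]
    return x_new, m_new, y_new


# Made-up inputs, for illustration only.
x_obs = [0.2, 1.1, -0.7, 0.5, 1.9]
res_m = [0.1, -0.3, 0.2, 0.0, 0.05]
res_y = [-0.2, 0.4, 0.1, -0.1, -0.15]
x_sim, m_sim, y_sim = simulate_mediation(
    x_obs, res_m, res_y, i_m=0.0, a=0.5, i_y=1.0, b=0.4, direct=0.1,
    n=100, seed=42,
)
```

Because only observed values and residuals are resampled, no distribution families or distribution parameters have to be specified by the user.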

aalfons commented 2 years ago

Ad point 1: I think that we need to use the estimates on the original data. Since the bootstrap estimates are means over the bootstrap replicates, the bootstrap estimate of ab is in general no longer the product of the bootstrap estimates of a and b. A user may then be confused as to what the true indirect effect is on the simulated data (namely the product of the bootstrap estimates of a and b, not the reported bootstrap estimate of the indirect effect).
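
A small numerical sketch of this point, in Python and with made-up bootstrap replicates (none of the numbers come from robmed): the mean of the products a*b over the replicates differs from the product of the means of a and b by exactly the covariance of the replicates (with divisor R). The replicates below are deliberately generated with a positive correlation so that the gap is visible.

```python
import random
from statistics import fmean

rng = random.Random(1)

# Hypothetical bootstrap replicates of the two path coefficients. A shared
# noise term makes them positively correlated (illustrative assumption).
R = 5000
a_reps, b_reps = [], []
for _ in range(R):
    shared = rng.gauss(0.0, 0.1)
    a_reps.append(0.5 + shared + rng.gauss(0.0, 0.05))
    b_reps.append(0.4 + shared + rng.gauss(0.0, 0.05))

# Bootstrap estimates as means over the replicates.
a_boot = fmean(a_reps)
b_boot = fmean(b_reps)
ab_boot = fmean([a * b for a, b in zip(a_reps, b_reps)])

# Difference between the reported indirect effect and the product of the
# reported path coefficients.
gap = ab_boot - a_boot * b_boot

# The gap equals the covariance of the replicates (divisor R).
cov_ab = fmean([(a - a_boot) * (b - b_boot) for a, b in zip(a_reps, b_reps)])

print(f"mean of products: {ab_boot:.4f}")
print(f"product of means: {a_boot * b_boot:.4f}")
print(f"gap: {gap:.4f}, covariance of replicates: {cov_ab:.4f}")
```

With estimates from the original data, the indirect effect is by construction the product of the estimates of a and b, so this ambiguity does not arise.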