lebebr01 / simglm

Simulate regression models
https://simglm.brandonlebeau.org/

Question : Model specification #46

Closed ns-rse closed 7 years ago

ns-rse commented 7 years ago

Hi,

Thanks for writing and sharing this package, it looks incredibly powerful and in turn useful.

I would be grateful if you've time for advice on model specification. I've been tasked with checking simulations someone else has performed (using Stata) for a longitudinal study of a Randomised Control Trial with two arms (group), adjusting for baseline (baseline), time (Time) and interaction between time and group. Since the study is longitudinal they have clustered at the individual level (id) and interacted treatment allocation (group) with time (month). With the data in long-format they have simulated multiple time points (n = 8) and specify the regression model along the lines of...

y ~ baseline + group*month + (1 | id)

...since interaction terms are automatically expanded (in this case to group + month + group * month).
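
As a small illustration of the expansion described above, base R's `terms()` can list the terms a formula expands to. This is only a sketch on the fixed-effects part of the formula from the post (the `(1 | id)` random intercept is left out here); in R's notation the interaction-only term is written `group:month`.

```r
# Fixed-effects part of the model from the post (random intercept omitted here).
f <- y ~ baseline + group * month

# terms() expands the `*` shorthand into main effects plus the interaction.
expanded <- attr(terms(f), "term.labels")
print(expanded)
```

The printed labels show both main effects and the `group:month` interaction, which is why `group*month` alone is enough in the model formula.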
My first approach to checking this with sim_pow_glm() is to fire up the Shiny example (a very neat addition by the way, thank you) and set the following parameters. I've listed the options and (in brackets) my thoughts on what the parameter should be, or whether I've left it as a default.

| Parameter | Option | Comment |
|---|---|---|
| Model | | |
| Model | two-level | |
| Type of Nesting | longitudinal | |
| Sample Size | | |
| Sample Size Level 1 | 20 | default for now |
| Sample Size Level 2 | 1 | as individuals are the unit of clustering? |
| Random Errors | | |
| Level 1 Error Variance | 5 | default for now |
| Var int | 1 | default for now |
| Var time | 1 | default for now |
| Covariate Details | | |
| Include Intercept | yes | default for now |
| Number of Covariates | 4 | One for each of baseline, group, id |
| Beta Int | 1 | default for now |
| Beta Time | 1 | default for now |
| Beta group.f | 1 | default for now |
| Beta baseline | 1 | default for now |
| Beta id | 1 | default for now |
| Beta Time*group.f | 1 | default for now |
| Mean group.f | 0 | default for now |
| Mean baseline | 0 | default for now |
| Mean id | 0 | default for now |
| SD group.f | 1 | default for now |
| SD baseline | 1 | default for now |
| SD id | 1 | default for now |
| Type group.f | lvl1 | default for now |
| Type baseline | lvl1 | default for now |
| Type id | lvl2 | default for now |
| Outcome | | |
| Type of Outcome | continuous | |
| Unbalanced | | |
| Unbalanced lvl2 clusters | No | Since it's individuals? |
| Unbalanced lvl3 clusters | No | No lvl3 clusters |
| Missing Data | | |
| Simulate missing data | yes | |
| Type | dropout | |
| Proportion | 0.1 | |
| Discrete Covariates | | |
| Discrete Covariates | 1 | Required for group |
| Levels | 2 | Two groups |
| Type | lvl1 | Fixed effect |
| Covariate Misc | | |
| Specify covariate names | yes | To help my sanity |
| Cov1 | group.f | Discrete covariate for grouping |
| Cov2 | baseline | Adjustment for baseline |
| Cov3 | id | Individual identifier for random effects |
| Cov4 | Time * group.f | Interaction (since used in simulations I'm attempting to check) |
| Discrete Covariates | | |
| Number | 1 | For group |
| Levels group.f | 2 | Two groups |
| Type group.f | lvl1 | |
| Random Error Distribution | | |
| Change Random Dist? | no | |

Problems

When I hit the Simulate Data button I'm told...

Error : $ operator is invalid for atomic vectors

...and the following from the console...

Warning: Error in : $ operator is invalid for atomic vectors
Stack trace (innermost first):
    133: sim_reg_nested
    132: sim_reg
    131: eventReactiveHandler [/usr/lib64/R/library/simglm/shiny_examples/demo/server.r#393]
    111: gen_code
    110: f
    109: %ni%
    108: random_missing
    107: missing_data
    106: eventReactiveHandler [/usr/lib64/R/library/simglm/shiny_examples/demo/server.r#471]
     86: miss_data
     85: lapply
     84: sapply
     83: exprFunc [/usr/lib64/R/library/simglm/shiny_examples/demo/server.r#491]
     82: widgetFunc
     81: func
     80: origRenderFunc
     79: renderFunc
     78: origRenderFunc
     77: output$gen_examp
      2: shiny::runApp
      1: simglm::run_shiny
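
For readers unfamiliar with this message, the sketch below shows how base R produces it in general: `$` works on lists but not on atomic vectors. The object names are hypothetical, and nothing here establishes which object inside `sim_reg_nested` triggered the error in this report.

```r
# An atomic (non-list) named vector: elements are accessed with [[ ]] or [ ].
vec <- c(numlevels = 2)

# Using `$` on it raises the error quoted above; capture the message instead of stopping.
msg <- tryCatch(vec$numlevels, error = function(e) conditionMessage(e))
print(msg)

# The same access on a list works.
lst <- list(numlevels = 2)
print(lst$numlevels)
```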

Questions

  1. Beta can be entered for all four co-variates, but mean and SD can be entered for only three. Is this meant to happen, or is it an artefact of having specified the co-variate name as an interaction term between two other co-variates?
  2. Am I specifying my model correctly or completely off the mark?

The code generated from Shiny (also a very nice feature) is included below should it be useful...

fixed <- ~ 1 + time + group.f + baseline + id + Time*group.f
random <- ~ 1 + time  ## Not what I'd like to have since I have to replicate simulations done as 1 + id
fixed_param <- c(1, 1, 1, 1, 1, 1)
random_param <- list(random_var = c(1, 1), rand_gen = 'rnorm')
cov_param <- list(mean = c(0, 0, 0, NULL), sd = c(1, 1, 1, NULL), var_type = c('lvl1', 'lvl1', 'lvl1', 'NULL'))
n <- 1
p <- 20
error_var <- 5
with_err_gen <- rnorm
data_str <- "long"
fact_vars <- list(numlevels = 2, var_type = lvl1)
unbal <- FALSE
unbalCont <- NULL
temp_nested <- sim_reg(fixed = fixed, random = random, fixed_param = fixed_param, 
random_param = random_param, cov_param = cov_param, k = NULL, n = n, p = p, 
error_var = error_var, with_err_gen = with_err_gen, data_str = data_str, fact_vars = fact_vars, 
unbal = unbal, unbalCont = unbalCont)
miss_clustvar <- NULL
miss_withinid <- NULL
missing_cov <- NULL
missing_type <- "random"
miss_prop <- 0.2
temp_miss <- missing_data(temp_nest, miss_prop = miss_prop, type = missing_type, clust_var = 
miss_clustvar, within_id = miss_withinid, miss_cov = missing_cov)
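
One detail of the generated code can be checked in base R alone: `c()` silently drops `NULL`, whereas the quoted string `'NULL'` is kept as an ordinary element. The sketch below only demonstrates that language behaviour on the vectors from `cov_param`; whether this length mismatch contributes to the error above is an assumption, not something the thread confirms.

```r
# Vectors as they appear in the generated cov_param list.
means    <- c(0, 0, 0, NULL)
sds      <- c(1, 1, 1, NULL)
var_type <- c('lvl1', 'lvl1', 'lvl1', 'NULL')

# NULL contributes no element, but the string 'NULL' does.
length(means)     # 3
length(sds)       # 3
length(var_type)  # 4
```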

I'd like to first generate a data set that I can check the structure of before moving on to power simulations and varying the sample size, co-variate means/SDs etc. etc. but am unsure if I'm on the right track.
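
For checking the structure of such a data set independently of the Shiny app, a hand-rolled base R sketch is shown below. It does not use simglm's API. The sample size of 20 participants, the seed, and the coding of `month` as 0 to 7 are assumptions made for illustration; the coefficients of 1, the random-intercept variance of 1 and the residual variance of 5 mirror the defaults listed in the table above.

```r
set.seed(46)  # arbitrary seed, for reproducibility only

n_id    <- 20  # hypothetical number of participants
n_month <- 8   # eight time points, as described in the post

# One row per participant: arm, baseline value, and random intercept (variance 1).
person <- data.frame(
  id       = seq_len(n_id),
  group    = rep(0:1, length.out = n_id),
  baseline = rnorm(n_id, mean = 0, sd = 1),
  u0       = rnorm(n_id, mean = 0, sd = sqrt(1))
)

# Expand to long format: one row per participant per month.
long <- person[rep(seq_len(n_id), each = n_month), ]
long$month <- rep(seq_len(n_month) - 1, times = n_id)
rownames(long) <- NULL

# All coefficients set to 1 and residual variance 5, mirroring the table defaults.
long$y <- 1 + 1 * long$baseline + 1 * long$group + 1 * long$month +
  1 * long$group * long$month + long$u0 + rnorm(nrow(long), sd = sqrt(5))

str(long)
```

A data frame like `long` could then be passed to a mixed-model fitter (for example `lme4::lmer` with the formula `y ~ baseline + group * month + (1 | id)`) to confirm that the structure matches the intended model.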

Thanks in advance for any assistance you can provide.

lebebr01 commented 7 years ago

Thanks for the question. Unfortunately, the Shiny application is completely and utterly broken. The application was working for last year's useR conference; however, I have since modified the package syntax slightly, which makes the application unusable.

Fortunately, the Shiny application should be updated within the next month to generate correct code and hopefully not be utterly broken. I am talking about the package at JSM in early August, and that is my deadline for this.

ns-rse commented 7 years ago

Thanks for taking the time to update me on the current state of play, appreciated.

I look forward to the revision. Good luck with your talk at JSM.

lebebr01 commented 7 years ago

Although not extensively tested, the Shiny app for this package should generate data and correct code for simple models.

You can install the latest version using:

devtools::install_github('lebebr01/simglm')

If you do try this and are still receiving errors, I'd appreciate it if you passed those along so that I can hopefully fix them. The Shiny app is large and cumbersome, so having more than a single tester is helpful.

Cheers.

ns-rse commented 7 years ago

Hi Brandon,

Thanks for notifying me of the update and apologies for taking so long to get back to you, work has been keeping me busy.

I've installed the new version and had a play around with it. I'm not sure I've tested every possible option but here are a few things I noticed...

Discrete Co-variates

Increasing the number of Discrete Co-variates can cause errors. If there are three continuous co-variates and I attempt to set three discrete variables too then the 'Mean cov', 'SD cov' and 'Type cov' under Continuous Co-variates all show...

Error: missing value where TRUE/FALSE needed

...this may be by design but I expect there may be scenarios where people do have multiple discrete variables they wish to simulate.
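
In general, R raises this message when an `if` condition evaluates to `NA`. The sketch below reproduces it with a hypothetical variable, `n_discrete`; it illustrates the language behaviour only and does not identify the actual condition in the Shiny app's code.

```r
# A comparison involving NA yields NA rather than TRUE or FALSE.
n_discrete <- NA
cond <- n_discrete > 0
print(cond)

# `if` cannot branch on NA, which produces the message quoted above.
msg <- tryCatch(if (cond) "yes" else "no", error = function(e) conditionMessage(e))
print(msg)

# isTRUE() is one defensive alternative: it returns FALSE for NA.
print(isTRUE(cond))
```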

Update Simulation

I played around with a few options then hit "Update Simulation" and got the error message...

Error: 5 parameters specified for 2 variables in design matrix

The options I'd selected ultimately were not the problem, as they worked ok when I started a new instance of Shiny; it seemed to be an issue with selecting and unselecting options. I realise this isn't a particularly useful description as it's not reproducible, but does it perhaps suggest that the order in which options are selected/enabled is important?
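
The message indicates that the number of coefficients supplied did not match the number of columns in the design matrix. A minimal base R sketch of that kind of check is given below, using hypothetical inputs (a two-column design and five coefficients); it is not the package's own validation code.

```r
# Hypothetical inputs: a two-column design (intercept + time) but five coefficients.
dat <- data.frame(time = 0:7)
fixed <- ~ 1 + time
fixed_param <- c(1, 1, 1, 1, 1)

# Build the design matrix implied by the formula.
X <- model.matrix(fixed, data = dat)

# Compare the number of coefficients with the number of design-matrix columns.
n_cols <- ncol(X)
n_par <- length(fixed_param)
mismatch <- n_par != n_cols
if (mismatch) {
  message(n_par, " parameters supplied for ", n_cols, " design-matrix columns")
}
```

Comparing these two counts before simulating is one way to spot inputs that have gone out of step, for instance after options were toggled on and off.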

Not much, I'm afraid, and even less helpful in that I don't have time to fork and collaborate, but I hope it's of some utility.

Neil

-- "We have the duty of formulating, of summarising, and of communicating our conclusions, in intelligible form, in recognition of the right of other free minds to utilize them in making their own decisions." - R.A. Fisher (1955)

Neil Shephard Clinical Trials Research Unit University of Sheffield

lebebr01 commented 7 years ago

I appreciate you taking the time to work through the Shiny app, and these notes are very helpful in my attempt to get this app working as smoothly as possible. I've also seen one of the error messages you obtained but could not directly isolate it; I need to spend more time with it.

If you are interested in the progress of the package/Shiny app, you can passively watch this repo and also follow announcements at https://brandonlebeau.org