crtahlin / medplot

Functions for drawing graphs in R visualizing medical information.
4 stars 2 forks source link

Add a mixed model #77

Closed crtahlin closed 10 years ago

crtahlin commented 10 years ago

In the draft of the article, it is mentioned: "TODO: need to add an analysis tool that takes into account the measurements made over time, using a mixed model. Missing so far, something needs to be included."

What kind of model, what needs to be done?

llaarraa commented 10 years ago

Example for a binary outcome

library(lme4)
j=10

#the response is binary
y=ifelse(dataFiltered[,j]>0, 1, 0)
my.var=dataFiltered[,input$groupingVar]

########### mixed model: including only one variable
my.mod=glmer(y~my.var + (1|PersonID) , family=binomial,  na.action=na.omit, data=dataFiltered)

OR=exp(summary(my.mod)$coef[2,1])
CI.OR=exp(summary(my.mod)$coef[2,1]+c(-1, 1)*qnorm(.975)*summary(my.mod)$coef[2,2])
p.OR=summary(my.mod)$coef[2,4]

later we can include also measurement occasion (as factor in the model), and time (days from baseline)

crtahlin commented 10 years ago

How should the output look like - a table with a line for each of the symptoms? E.g.:

Variable | Odds ratio | 95% conf. interval | P value Fatigue | 1.2 | 0.9 to 1.3 | 0.04 Malaise | .....

It should probably output only if "Treat and analyse variables as binary?"=TRUE ?

llaarraa commented 10 years ago

The above is fine for binary variables

for numerical adapt the following code:

library(lme4)
library(lmerTest) ### add to required libraries
j=10

#the response is numerical
y=dataFiltered[,j]
my.var=dataFiltered[,input$groupingVar]

########### mixed model: including only one variable
my.mod=lmer(y~my.var + (1|PersonID) , na.action=na.omit, data=dataFiltered)

beta=summary(my.mod)$coef[2,1]
#CI.beta=summary(my.mod)$coef[2,1]+c(-1, 1)*qnorm(.975)*summary(my.mod)$coef[2,2]
#confidence intervals 
CI.beta=confint(my.mod)[4,]
p.beta=summary(my.mod)$coef[2,5]

as for the prevsious example: report beta instead of OR, 95% CI, P value

llaarraa commented 10 years ago

To these tabs (numerical or binary) add a graph that summarizes the results of the models, as it appears now in logistf. Could be done using ggplot, you have the point estimates and the lower and upper limit of the CI, plot those

llaarraa commented 10 years ago

Last thing that I would like to add for this part: adjustment for the time where the measurement was taken

The code to do it (similar to those reported above) is below:

################### adjust the analysis for measurement occasion

#the response is binary
y=ifelse(dataFiltered[,j]>0, 1, 0)
my.var=dataFiltered[,input$groupingVar]

########### mixed model: including only one variable and occasion measurement
my.mod=glmer(y~ as.factor(Measurement) +  my.var + (1|PersonID) , family=binomial,  na.action=na.omit, data=dataFiltered)

my.summary=summary(my.mod)$coef
num.coef=nrow(my.summary)

OR=exp(my.summary[num.coef,1])

CI.OR=exp(summary(my.mod)$coef[num.coef,1]+c(-1, 1)*qnorm(.975)*summary(my.mod)$coef[num.coef,2])

p.OR=summary(my.mod)$coef[num.coef,4]

############# numerical variables

j=10

#the response is numerical
y=dataFiltered[,j]
my.var=dataFiltered[,input$groupingVar]

########### mixed model: including only one variable
my.mod=lmer(y~ as.factor(Measurement) + my.var + (1|PersonID) , na.action=na.omit, data=dataFiltered)

my.summary=summary(my.mod)$coef
num.coef=nrow(my.summary)

beta=summary(my.mod)$coef[num.coef,1]
#CI.beta=summary(my.mod)$coef[2,1]+c(-1, 1)*qnorm(.975)*summary(my.mod)$coef[2,2]
#confidence intervals 
CI.beta=confint(my.mod)[num.coef,]
p.beta=summary(my.mod)$coef[num.coef,5]

Here it would be nice to report also the estimated effect for time, maybe graphically, -- estimates and CI. The relevant data (estimates and CI can be obtained from the output of the model - similarly as done for beta)

Ideally: the user goes to the tab, decides if the analysis should be adjusted for the time of measurement or not - even more ideally: has two options : to use measurement occasion as factor, or to adjust for the number of days since inclusion in the study - the same variable used for the timeline graph. Accordingly to the choice , the appropriate model is fitted

crtahlin commented 10 years ago

To recap, it would be ideal to give user three options:

crtahlin commented 10 years ago

Tables implemented in d7471ab8c0026249f983d7b8b7e14964a248075f.

llaarraa commented 10 years ago

tried it: I get this error ERROR: could not find function ".mixedModel"

crtahlin commented 10 years ago

Will check. But at the moment I have another bug I have to fix first - the output is for the wrong coeficients.

@llaarraa : Does the order of the terms in the model matter (I think it does in principle, but does it matter in our case)? For example, now we have: y~ as.factor(Measurement) + my.var + (1|PersonID) Could we have: y~my.var + as.factor(Measurement) + (1|PersonID)

crtahlin commented 10 years ago

Tables are nearly done - but I am getting a lot of warnings (in the console, not visible to user) when treating variables as binary:

 ...
 Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
 Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
 Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
 Model failed to converge with max|grad| = 0.228017 (tol = 0.001) 
 ...
crtahlin commented 10 years ago

Tables implemented in d671339e4ed030d26dd12b327e204f53e6abe5da . (Time/measurment do not play much of a role - their coefficients are close to zero?) Graphs remain to be implemented.

crtahlin commented 10 years ago

Implemented all the graphs in 82d7aa4817ae9b60e85a3724c21824d780649e34, e478f82d5f3c58d5678fc9f749b05523b703d500, c83eef9f72b255b69b2a4bcfac0b7cda82df9a50. Closing issue.

crtahlin commented 10 years ago

Changed some function names, so that they don not start wit ".". Now they get exported and can be "found" by the app. (This was the problem with ".mixedModels" mentioned above somewhere.) Commit 6b1a338b90573d10059adce64784d56e2ca9c37e.

crtahlin commented 10 years ago

After rm(list=ls()) some problems with "treatasBinary=FALSE" combination:

Error in print(plotFixedEffectsofGroupingVar(calculatedStatistics = mixedModelResults()    [["groupingVar"]],  : 
  error in evaluating the argument 'x' in selecting a method for function 'print': Error in     summary(model)$coef[groupingVarCoefName, "Pr(>|t|)"] : 
  subscript out of bounds

Reopening.

crtahlin commented 10 years ago

If I source() the file containing mixedModel() I get no errors. If the function is not sourced, but loaded via the medplot package, than the sumary() function does not return P values, that should be provided by the package lmerTest that extends the lme4 packege. IT looks a though only the summary of lme4 is returned. Although the model objct is of the type merModLmerTest, which is supposedly OK.

Researching what the problem could be... Something with loading the packegages I guess, but at them moment no good ideas.

crtahlin commented 10 years ago

Fixed in bc68daef5b54b1a806220fd54d7207ba5e27a12b . Added explicit use of the right summary() function via lmerTest::summary() call. Closing.

(Not sure why the wrong summary() function was used, since the lmerTest package was loaded after the lme4?)