Closed decastrof closed 7 years ago
I think difference is because notBoxPlot
calculates the 95% CI of each group separately whereas your model bases each CI on the variance of all the groups together.
Here's your model with the data as you sent me:
Linear regression model:
resp ~ 1 + treat
Estimated Coefficients:
Estimate SE tStat pValue
________ _______ ________ ________
(Intercept) -0.092 0.14985 -0.61395 0.5563
treat_F -0.5984 0.21192 -2.8237 0.022366
Number of observations: 10, Error degrees of freedom: 8
Root Mean Squared Error: 0.335
I re-define resp
to make a couple of values in only the second group larger:
resp= [0.084 -0.1 -0.202 -0.267 0.025 -0.58 -0.782 -1.149 1.021 -2.962]';
Now re-fit the model:
Linear regression model:
resp ~ 1 + treat
Estimated Coefficients:
Estimate SE tStat pValue
________ _______ _______ _______
(Intercept) -0.092 0.45297 -0.2031 0.84413
treat_F -0.7984 0.6406 -1.2463 0.2479
Number of observations: 10, Error degrees of freedom: 8
Root Mean Squared Error: 1.01
The SE for the intercept term has also gone up. So if I fit your original model with only the first variable I get:
Linear regression model:
resp ~ 1 + treat
Estimated Coefficients:
Estimate SE tStat pValue
________ ________ _______ _______
(Intercept) -0.092 0.066126 -1.3913 0.23653
Number of observations: 5, Error degrees of freedom: 4
The CI of that is:
>> coefCI(M3,0.05)
ans =
-0.2756 0.0916
Which is the same as:
>> [mean(A)-tInterval_Calc(A),mean(A)+tInterval_Calc(A)]
ans =
-0.2756 0.0916
That's what notBoxPlot
uses if you ask for a CI based on the t interval. By default it uses the normal distribution which gives you:
>> [mean(A)-SEM_calc(A),mean(A)+SEM_calc(A)]
ans =
-0.2216 0.0376
Not sure what is the best way forward. Clearly it depends on whether the user wants pooled sum of squares or not. In this case what are you trying to achieve?
I have a thought. What if notBoxPlot
has a new call style that looks like this:
>> F=fitlm(T,'Var1 ~ 1+Var2');
>> notBoxPlot(F)
The raw data are in F.Variables
, so everything is possible with just the fit object and it will be really obvious that notBoxPlot
is doing something different to what it normally does.
Thanks Rob. Very thorough. Actually, your suggestion would be perfect. To have a call style that matches the model already fitted, so the figure shows the same info as the analysis. FdC
I'll see if I can find some time soon to do this.
Right!
This commit: ea0ad26c5926b37d59f99de5d7d79b6aa3326acc on the dev branch should do what you need. Can you see if it works? Either checkout the dev branch or download the zip from this page. Take a look at the examples by running NBP.linearModelExamples
and then see if the new notBoxPlot
call scheme works for your data.
If it's good I'll tidy it up and merge it into the main branch.
EDIT: in case it's not obvious, the blue 1SD bar becomes 1.96 SDs when you supply a LinearModel
It works perfectly with my data. Except if the predictor in the linear model is a cell array. In that case I get an error: Error using double Conversion to double from cell is not possible. Error in notBoxPlot (line 188) x=double(x);
This can be avoided making the predictor categorical: X.Predictor= categorical(X.Predictor);
Thanks! Great work
I'll look into the best way to fix this. I want to make sure that coercing the independent variable to categorical doesn't have some other unintended problem relating to the model fit or what the user intended to do.
Comparing the confidence intervals returned by notBoxPlot and those from 'coefCI' applied to a model fitted with 'fitlm' they don't match. Not at all. Any suggestions? I include my data and code below. The groups have normal distribution (lilliefors test).
clear all resp= [0.084 -0.1 -0.202 -0.267 0.025 -0.58 -0.782 -1.149 0.021 -0.962]'; treat= {'D','D','D','D','D','F','F','F','F','F'}'; D= table(resp,treat); M2= fitlm(D, 'resp ~ 1 + treat'); coefCI(M2,0.05) ans = -0.43756 0.25356 -1.0871 -0.10971
x =[ 1 1 1 1 1 2 2 2 2 2]; notBoxPlot(D.resp,x)