rpietro / QoLFunctionalCapacityAVBlock

Analysis of long term QoL of patients with AV blockade

What does "Multiple R-squared" mean? #8

Open katiasilva opened 12 years ago

katiasilva commented 12 years ago

When I use the model below:

    qplot(qrs_duration_paced, qrs_duration_inhibition) + geom_smooth(fill="cornflowerblue", method = "loess", size = 1)
    model10 <- lm(qrs_duration_paced ~ qrs_duration_inhibition)
    summary(model10)

I have this answer:

    Residuals:
        Min      1Q  Median      3Q     Max
    -36.739 -12.739   1.126  11.543  33.990

    Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
    (Intercept)             100.1798    16.8280   5.953 4.28e-07 ***
    qrs_duration_inhibition   0.5729     0.1877   3.053  0.00388 **
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 17.8 on 43 degrees of freedom
      (21 observations deleted due to missingness)
    Multiple R-squared:  0.1781,  Adjusted R-squared:  0.159
    F-statistic:  9.32 on 1 and 43 DF,  p-value: 0.003878

I was wondering if "Multiple R-squared" could be interpreted as a correlation index?

Thank you, Katia

rpietro commented 12 years ago

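short answer is no, not directly. r squared is a measure of how much of the variation in the outcome the model explains and how much is left unexplained. so, if the value is, say, 0.35, then the variables in the model explain 35% of the variance in the outcome, with the other 65% left to chance or to variables not in the model. "multiple r squared" is just that plain r squared; with a single predictor, as in your model, it equals the square of the pearson correlation between the two variables, so it is related to the correlation but is not the correlation itself. adjusted r squared is that same measure, but taking into account the number of variables in the model: the more variables you include, the more it is penalized. some links: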

http://goo.gl/hMdNF http://goo.gl/2gsiS http://goo.gl/lrzOa http://goo.gl/wG6z1
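
For a single-predictor model like model10, the relationship between Multiple R-squared, the Pearson correlation, and Adjusted R-squared can be checked directly. This is only a minimal sketch on simulated data, not the study dataset; the variables `x` and `y`, the seed, and the simulation parameters are made up for illustration.

```r
# Simulated data for illustration only (not the study data); names are hypothetical.
set.seed(42)
n <- 45
x <- rnorm(n, mean = 90, sd = 15)
y <- 100 + 0.6 * x + rnorm(n, sd = 18)

fit <- lm(y ~ x)
s <- summary(fit)

# Multiple R-squared: share of the variance in y explained by the model
r2 <- s$r.squared

# With a single predictor it equals the squared Pearson correlation
r <- cor(x, y)
all.equal(r2, r^2)

# Adjusted R-squared penalizes R-squared for the number of predictors p
p <- 1
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
all.equal(adj_r2, s$adj.r.squared)
```

Applied to the output in the question (45 complete cases, one predictor): sqrt(0.1781) is about 0.42, which is the size of the correlation between the two QRS durations, and 1 - (1 - 0.1781) * 44 / 43 is about 0.159, which matches the reported Adjusted R-squared.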

Joao, Mathias, Jacson, and Jose Eduardo -- as you guys know, this is the kind of information that would go into our toolbox documentation, with each data analysis method described in terms of an input (the variables required to run the method), an output (what comes out of the method and how it is interpreted), and an algorithm (whatever turns the input into the output). now, the future of the plot ontology (Mathias, this is a system that will tell you that if you have variables x, y, and z, your alternatives for graphically displaying them are a, b, and c) is to cover the characteristics of not only plots but any type of data analysis method. the basic framework for the ontology section describing statistical methods will be the toolbox. in other words, we will have ontology classes describing input, algorithm, and output.

(In reply to katiasilva's message of Thu, Oct 11, 2012 at 11:25 AM.)

rpietro commented 12 years ago

will it be necessary to add a characteristic/name to every variable included in the dataset in order to make use of this plot ontology? by characteristic/name I mean what kind of variable it is.

(In reply to Ricardo Pietrobon's message of Thu, Oct 11, 2012 at 6:23 PM.)

rpietro commented 12 years ago

we would have to know the class (factor, numeric, ...), but this is far down the line. first step is to add the class by hand
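
As a rough sketch of that first step: R can report the class of each variable, and a hand-written vector can fill in or override it. Everything below is hypothetical and only for illustration: the toy data frame and its values, the `sex` column, the `var_class` annotation, and the `plot_options` lookup are not part of the project; only the two QRS column names come from this thread.

```r
# Hypothetical toy data; only the QRS column names come from this thread.
d <- data.frame(
  qrs_duration_paced      = c(150, 162, 148),
  qrs_duration_inhibition = c(88, 95, 90),
  sex                     = factor(c("F", "M", "F"))
)

# Classes as R detects them
detected <- sapply(d, class)

# Classes added by hand (hypothetical annotation); these take precedence
var_class <- c(sex = "factor")
classes <- detected
classes[names(var_class)] <- var_class

# Hypothetical lookup: plot alternatives for a pair of variable classes
plot_options <- list(
  "numeric-numeric" = c("scatterplot", "scatterplot with smoother"),
  "factor-numeric"  = c("boxplot", "violin plot")
)

# Return the plot alternatives for two variables, or NULL if none are listed
suggest_plots <- function(x, y) {
  key <- paste(sort(c(classes[[x]], classes[[y]])), collapse = "-")
  plot_options[[key]]
}

suggest_plots("qrs_duration_paced", "qrs_duration_inhibition")
```

The lookup keys on sorted class names so that the order of the two variables does not matter; a real ontology would need many more classes and plot types than this.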

(In reply to Mathias Worni's message of Sun, Oct 14, 2012 at 3:27 PM.)