Open CrumpLab opened 6 years ago
The plot looks like what Walter got in his analysis, but the numbers are different:
The correlation for me was -0.3202597, R^2 = .1025
Walter, I think one issue might be that you ran the linear regression using r as the dependent variable, and not the predictor variable. Your code was:
expertise.mod1 = lm(r ~ IKSIs, data = correlations)
but try
expertise.mod1 = lm(IKSIs ~ r, data = correlations)
and see what happens
After Nick's pre-processing (issue #10):
Call: lm(formula = IKSIs ~ r, data = correlations)
Residuals:
     Min       1Q   Median       3Q      Max
-122.244  -35.472   -6.851   29.423  182.643
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 51.06 on 344 degrees of freedom
Multiple R-squared: 0.05126, Adjusted R-squared: 0.04851
F-statistic: 18.59 on 1 and 344 DF, p-value: 2.122e-05
###########
cor: -0.2264159
r^2 = 0.05126415
I'm getting similar results for this following Nick's pre-processing steps. I eliminated capital letters, rather than whole words, and get cor -.20, so pretty close
I am confused: Shouldn't r be the dependent variable and IKSI be the independent variable? "as a function of mean IKSI"
Oops, yes, I was being confusing and wrong before. If we are predicting pearson_r as a function of mean_iksi, then the formula should be reversed. But, we could also say we were predicting mean_iksi from pearson_r, and then it would stay the same. Because we only have two variables, the correlation should be the same regardless of which way we do it (so my suggestion from before that you should switch the order was pointless, because you should get the same thing no matter what).
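A quick way to convince yourself of this is a small simulation. This is only a sketch on made-up data: `x` and `y` are hypothetical stand-ins, not our `pearson_r` and `mean_IKSI` values. It checks that the correlation and R-squared do not depend on which variable goes on the left of the formula, while the slope does.

```r
# Simulated data only: x and y are made-up stand-ins, not the typing data
set.seed(1)
n <- 346
x <- rnorm(n)
y <- -0.25 * x + rnorm(n)

r      <- cor(x, y)
fit_yx <- lm(y ~ x)
fit_xy <- lm(x ~ y)

# Pearson r is symmetric, and in simple regression R^2 equals r^2
all.equal(cor(x, y), cor(y, x))
all.equal(summary(fit_yx)$r.squared, r^2)
all.equal(summary(fit_xy)$r.squared, r^2)

# The slopes differ (y per unit x vs. x per unit y),
# but their product is again r^2
slope_yx <- coef(fit_yx)[["x"]]
slope_xy <- coef(fit_xy)[["y"]]
all.equal(slope_yx * slope_xy, r^2)
```

So the choice of formula only matters for how the slope and residuals are interpreted, not for the correlation or the p-value.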
Here are my results trying both formulas; they both give the same answers. So, right now I'm getting a correlation of -.257.
cor.test(correlation_data$pearson_r,correlation_data$mean_IKSI)
Pearson's product-moment correlation
data: correlation_data$pearson_r and correlation_data$mean_IKSI
t = -4.9471, df = 344, p-value = 1.181e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.3535490 -0.1565399
sample estimates:
       cor
-0.2577211
summary(lm(pearson_r ~ mean_IKSI, data = correlation_data))
Call: lm(formula = pearson_r ~ mean_IKSI, data = correlation_data)
Residuals:
     Min       1Q   Median       3Q      Max
-0.44554 -0.09184  0.00596  0.10833  0.36788
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1521 on 344 degrees of freedom
Multiple R-squared: 0.06642, Adjusted R-squared: 0.06371
F-statistic: 24.47 on 1 and 344 DF, p-value: 1.181e-06
cor.test(correlation_data$mean_IKSI, correlation_data$pearson_r)
Pearson's product-moment correlation
data: correlation_data$mean_IKSI and correlation_data$pearson_r
t = -4.9471, df = 344, p-value = 1.181e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.3535490 -0.1565399
sample estimates:
       cor
-0.2577211
summary(lm(mean_IKSI~pearson_r, data = correlation_data))
Call: lm(formula = mean_IKSI ~ pearson_r, data = correlation_data)
Residuals:
     Min       1Q   Median       3Q      Max
-130.172  -30.679   -5.883   27.525  164.410
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 44.75 on 344 degrees of freedom
Multiple R-squared: 0.06642, Adjusted R-squared: 0.06371
F-statistic: 24.47 on 1 and 344 DF, p-value: 1.181e-06
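As a sanity check, the regression summaries can be recovered from the correlation alone. The sketch below uses only the reported `cor` and `df` values from the output above and the standard identities for simple regression (R-squared is r squared, t is r * sqrt(df / (1 - r^2)), and F is t squared). The results should agree with the reported numbers up to rounding.

```r
# Values copied from the cor.test() output above
r  <- -0.2577211
df <- 344

r_squared <- r^2                        # compare with Multiple R-squared
t_stat    <- r * sqrt(df / (1 - r^2))   # compare with t from cor.test()
f_stat    <- t_stat^2                   # compare with the F-statistic
p_value   <- 2 * pt(-abs(t_stat), df)   # two-sided p-value

round(c(r_squared = r_squared, t = t_stat, F = f_stat), 4)
signif(p_value, 4)
```

The same check applies to the earlier run: squaring cor = -0.2264159 gives the reported r^2 of 0.05126.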
Behmer & Crump (2016) used this data set to determine whether individual typists were sensitive to letter, bigram, and trigram frequency. For each typist, they measured mean IKSI for each letter, bigram, and trigram, and then correlated the letter, bigram, and trigram mean IKSIs with letter, bigram, and trigram frequencies in natural English. So, for each subject, three correlation coefficients (using Spearman's) were obtained, one each for the letter, bigram, and trigram levels. They also plotted these coefficients against each subject's mean typing time, which served as a proxy for expertise.
We can do the same thing here:
Steps
1) get the correlation between H and mean IKSI for letter position and word length for each subject
2) get mean IKSI for each subject
3) plot the correlations between H and IKSI (position by length) as a function of mean IKSI
4) do a linear regression on above and report findings
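The steps above can be sketched in base R roughly as follows. This is not Walter's or Nick's actual code. The `cells` data frame and its columns (`subject`, `position`, `word_length`, `H`, `IKSI`) are hypothetical names, and the data are simulated so the sketch runs on its own. `cor()` defaults to Pearson, which matches the `pearson_r` naming used in this thread.

```r
# Hypothetical input: one row per subject x letter position x word length,
# holding H and the mean IKSI for that cell. Simulated here; swap in real data.
set.seed(2)
n_subjects <- 20
cells <- expand.grid(subject = 1:n_subjects, position = 1:9, word_length = 1:9)
cells <- cells[cells$position <= cells$word_length, ]
baseline <- rnorm(n_subjects, mean = 180, sd = 40)  # made-up per-subject speed
cells$H <- runif(nrow(cells), 0, 4)
cells$IKSI <- baseline[cells$subject] + 10 * cells$H +
  rnorm(nrow(cells), sd = 30)

# Steps 1 and 2: per-subject correlation between H and IKSI,
# and per-subject mean IKSI
by_subject <- split(cells, cells$subject)
correlation_data <- data.frame(
  subject   = as.integer(names(by_subject)),
  pearson_r = sapply(by_subject, function(d) cor(d$H, d$IKSI)),
  mean_IKSI = sapply(by_subject, function(d) mean(d$IKSI))
)

# Step 3: correlations as a function of mean IKSI
plot(correlation_data$mean_IKSI, correlation_data$pearson_r,
     xlab = "mean IKSI", ylab = "correlation between H and IKSI")

# Step 4: linear regression and the matching correlation test
fit <- lm(pearson_r ~ mean_IKSI, data = correlation_data)
summary(fit)
cor.test(correlation_data$pearson_r, correlation_data$mean_IKSI)
```

Using `pearson_r ~ mean_IKSI` follows the "as a function of mean IKSI" wording in step 3.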
Walter has done this. We can report our independent findings here, and discuss what this might mean.