Open MichaelChirico opened 4 years ago
There is not enough information in the uploaded output.
What was the original data, which commands were executed, and which oblique rotation was performed?
##################################################################################
##################################################################################
n = 250
J = 10
Q = 2
S = matrix(c(1,.4,.4,1),2,2)
A = matrix(c(rep(c(.7,0),each=J/2),rep(c(0,.7),each=J/2)),J,Q)
###R is the resulting correlation matrix showing a positive manifold
R=A%*%S%*%t(A);diag(R)=1;R
Y=matrix(rnorm(n*J),n,J)%*%chol(R);
Y=sweep(sweep(Y,2,apply(Y,2,mean)),2,apply(Y,2,sd),"/")
###GPA for wide list of rotations: same issue would pop up for geominQ, promax or any other oblique rotation
library(GPArotation)
EFA <-factanal(Y,Q,rotation="oblimin",scores = "Bartlett")
EFA
###It's merely a print issue, because model-implied statistics are still correct and not affected by misprint of the correlation sign
EFA$correlation
cor(EFA$scores)
OK, I can reproduce it. I don't think it is random, nor misprint, though.
In stats:::factanal we have
if (rotation != "none") {
rot <- do.call(rotation, c(list(load), cn$rotate))
load <- if (is.list(rot)) {
load <- rot$loadings
fit$rotmat <- if (inherits(rot, "GPArotation"))
t(solve(rot$Th))
else rot$rotmat
rot$loadings
}
else rot
}
fit$loadings <- sortLoadings(load)
with sortLoadings being
function (Lambda) { cn <- colnames(Lambda) Phi <- attr(Lambda, "covariance") ssq <- apply(Lambda, 2L, function(x) -sum(x^2)) Lambda <- Lambda[, order(ssq), drop = FALSE] colnames(Lambda) <- cn neg <- colSums(Lambda) < 0 Lambda[, neg] <- -Lambda[, neg] if (!is.null(Phi)) { unit <- ifelse(neg, -1, 1) attr(Lambda, "covariance") <- unit %% Phi[order(ssq), order(ssq)] %% unit } Lambda }
and I believe that the issue is that sortLoadings can switch the signs on and order of factors without modifying the rotation matrix, hence getting the loadings out of sync with the correlations.
A potential fix is to sort before rotating, but I am unsure whether that defeats the purpose of sortLoadings.
There's more:
sortLoadings appears to keep track of
attr(Lambda, "covariance")
but as far as I can tell, this attribute is only ever set if it was set already! Instead, print.factanal uses $rotmat, which is not tracked.
(In reply to Peter Dalgaard from comment #4)
There's more:
sortLoadings appears to keep track of
attr(Lambda, "covariance")
but as far as I can tell, this attribute is only ever set if it was set
already! Instead, print.factanal uses $rotmat, which is _not_ tracked.
Not sure whethere I understand correctly, but the part on
attr(Lambda, "covariance") <- unit %% Phi[order(ssq), order(ssq)] %% unit
seems to be intended to "correct" the correlation signs, but it tries to store this in an attribute that does not exist?
Yup, that's what it looks like, but at this point I'd like someone who knows their way around the code to take a look.
The rotmat business was the content of
---- r54989 | ripley | 2011-03-24 11:38:05 +0100 (Thu, 24 Mar 2011) | 1 line
fulfill wish of PR#12754 ----
and the rotate-before-sort issue seems to have been present in the original contributed code.
Incidentally, the obvious workaround for now is to just run the analysis with an orthogonal rotation (e.g. none) and then oblimin(loadings(EFA)).
This issue https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16391
discusses the same code block. According to the report when using oblique rotations an incorrect rotmat is returned because theta from GPARotate is transformed when it shouldn't be. Perhaps that is part of this issue also.
Created attachment 1621 [details] SampleOutput: Misprint & Regular
For any oblique rotation factanal, the sign of the interfactor correlations being printed is sometimes wrong, and yes REGARDLESS of any additional rotational freedom (scale/sign inversion of factors)
I have added a .txt file with sample output to illustrate what I mean and will summarize the issue below:
EXAMPLE DATA
Factanal results can be one of two possibilities: The correct result OR a MISPRINT
MISPRINT 1) ALL defining factor loadings are positive BUT negative interfactor correlation => this is clearly impossible as some of the model predicted correlations would become negative! 2) Funnily enough the model-implied correlations reported by factanal are correct (so all positive) and also the correlation between factor score estimates is correct
=> so it seems that the issue is merely with the print.method :)
METADATA