jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

(Incorrect) Negative Eigenvalues in JASP EFA Scree plot? #685

Closed KenMavor closed 1 year ago

KenMavor commented 4 years ago

Steps to reproduce:

The attached file shows the selected variables and the EFA output using Principal Axis Factoring with Oblimin rotation. For comparison, it also shows the same analysis run as a PCA.

The data set and JASP analysis are in the attached file.

NOTE: This issue seems similar to a previous issue about the axis being wrong in the Scree plot: https://github.com/jasp-stats/jasp-issues/issues/556, which is marked as resolved but clearly is not.

What raised the concern for me was that some eigenvalues in the EFA scree plot seemed to be negative. I know this is not true as I generated the data myself.

The previous poster felt that this might be a problem with the labelling of the Y-axis in the scree plot, since the documentation for the parallel analysis apparently talks about estimated communalities. This is a red herring, I think. Communalities will not exceed 1. Eigenvalues range from 0 (in a standard positive-definite matrix) to the number of variables in the analysis.

The easiest way to see that this is a real problem is to compare the Scree plots from the two almost identical procedures. In the attached file, if you look at the scree plot from the PCA you will see that it correctly shows the eigenvalues dropping dramatically after the third one and tailing off toward zero. In the EFA model, the same plot shows the eigenvalues trailing down into the negative range, which is clearly wrong.

So I suspect your earlier poster was correct and the problem may just be that the axis is incorrect.

I am not an R user so I cannot comment on what the author of the psych package says about eigenvalues and estimated communalities, but I think vankesteren was wrong in their interpretation of this. Eigenvalues and communalities are not the same and would not be plotted on the same axis scale. Based on the evidence supplied in the earlier post, where the plot changed from an earlier version (correct) to a later version with the error, this does seem more likely to be an error in the plotting of the graph.

Since it is plotted correctly in the PCA module, hopefully it is easy to fix it also in the EFA module?

Ken.

boxes2 For GitHub.jasp.zip

KenMavor commented 4 years ago

I am not an R user so I cannot comment on what the author of the psych package says about eigenvalues and estimated communalities, but I think vankesteren was wrong in their interpretation of this. Eigenvalues and communalities are not the same and would not be plotted on the same axis scale.

Actually, further to this point, I have now read the documentation of the psych package for the parallel analysis. I can see where the confusion has arisen. It is true that the parallel analysis generates (extracts) eigenvalues based on different assumptions about the communalities. However, these eigenvalues would still all be >0. (And the axis is definitely eigenvalues, not communalities!) What is being reported in the table are data eigenvalues (and randomly estimated eigenvalues). (Though in any case communalities should not be less than zero either).

BTW also forgot to say earlier that I think JASP is great and my motivation is to fix these things so I can make more use of it in teaching Factor Analysis.

Best regards,

Ken.

EJWagenmakers commented 4 years ago

@vankesteren, would be nice to get to the bottom of this before our next version

vankesteren commented 4 years ago

Dear Ken,

Thank you for the detailed report, I will investigate this week!

KenMavor commented 4 years ago

Dear Eric-Jan,

Thank you for your quick attention. I was discussing the issue further with a local colleague (Justin Ales) and we can see that there is some ambiguity in the description of the parallel analysis in the psych package which has perhaps led to this situation. He also found a very useful article by Dinno (See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.168.2473&rep=rep1&type=pdf), which could help explain what is going on here.

As a long-term user and teacher of factor analysis, I think that it is important to be clear about what is being presented in the graph, but also to help researchers who are used to certain common practices. If we can resolve the ambiguity about what the psych package is doing here, then perhaps the graph could use some better terminology (“Extraction Eigenvalues” or something...), and perhaps there could also be an option for the user to get the PCA version of the graph even if they are doing EFA. In my experience when researchers use the Kaiser criterion (Eigenvalue >1) they usually apply this to the PCA Eigenvalues, even when they are using an EFA type extraction (because the PCA eigenvalues have a clear interpretation, with each variable contributing 1 chunk of variance to the matrix to get redistributed). With EFA, the ambiguity arises about which values to use as communalities (SMCs or final estimates?) when producing the Eigenvalues. So even though one could use the communality-based eigenvalues and compare them to 0 (or to the simulated data line in the parallel analysis) instead of using the PCA eigenvalues and comparing them to 1, the latter is easier to understand and interpret.

So, once the ambiguity is resolved about what the psych package is actually doing here, a full solution might involve offering the additional graph from the PCA as a version of the scree plot that more researchers would actually be used to, and expecting to see… and then perhaps some footnotes on the output to explain what they show in a less ambiguous way. (Of course they could go to the PCA module to get that graph, but it would be a simple and useful addition to have it directly available in the EFA module.)

Happy to help however we can to resolve the uncertainty and make things clear to users!

BTW, as a teacher, I am a big fan of the option that is there to show the path model. I teach regression, mediation, and factor analysis using graphical path model notation, so being able to show the nicely drawn path models of EFA solutions is really nice and a feature I plan to incorporate into my teaching. So the efforts on the EFA module so far are already very much appreciated.

Ken.

On 19 Apr 2020, at 11:16, Erik-Jan van Kesteren wrote:

Dear Ken,

Thank you for the detailed report, I will investigate this week!

vankesteren commented 4 years ago

Dear Ken,

I have now investigated further, with help of the paper you linked, as well as their excellent answer on stackexchange. Thank you again for raising this issue and starting this discussion -- it really helps us and hopefully future users of JASP as well.

My conclusion is that the way we present the scree plot is correct. Negative eigenvalues are entirely possible for EFA. Indeed, it is confusing that for PCA the eigenvalue means something different than for EFA, but this is simply how these things are defined in the literature (as well as by the psych package).
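To make the difference concrete, here is a minimal sketch in Python with NumPy. It is not the psych or JASP code. It assumes squared multiple correlations (SMCs) as the communality estimates placed on the diagonal of the correlation matrix, which is one common convention; the loadings and the helper names `smc`, `pca_eigenvalues` and `efa_eigenvalues` are made up for illustration.

```python
import numpy as np

def smc(R):
    """Squared multiple correlation of each variable with all the others."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

def pca_eigenvalues(R):
    """Eigenvalues of the full correlation matrix (ones on the diagonal)."""
    return np.sort(np.linalg.eigvalsh(R))[::-1]

def efa_eigenvalues(R):
    """Eigenvalues of the reduced correlation matrix.

    Assumption: the diagonal is replaced by SMCs as communality estimates.
    """
    reduced = R.copy()
    np.fill_diagonal(reduced, smc(R))
    return np.sort(np.linalg.eigvalsh(reduced))[::-1]

# Hypothetical one-factor population: correlations implied by these loadings.
loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(loadings, loadings)
np.fill_diagonal(R, 1.0)

pca = pca_eigenvalues(R)
efa = efa_eigenvalues(R)

print("PCA eigenvalues:", np.round(pca, 3), "sum =", round(pca.sum(), 3))
print("EFA eigenvalues:", np.round(efa, 3), "sum =", round(efa.sum(), 3))
```

In this sketch the PCA eigenvalues are all positive and sum to the number of variables, whereas the EFA eigenvalues sum to the total of the communality estimates and the trailing ones fall below zero, even though the correlation matrix itself is perfectly valid.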

You stated the following:

In my experience when researchers use the Kaiser criterion (Eigenvalue > 1) they usually apply this to the PCA Eigenvalues, even when they are using an EFA type extraction

This may well be true, but they should not -- they should either compare the factor analysis eigenvalues to 0, or they should use the EFA parallel analysis, as explained so well in Dinno's paper. Both of these are currently provided by JASP.
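The two retention rules mentioned here can be sketched as follows, again in Python with NumPy and again only as an illustration rather than the psych or JASP implementation. The sketch assumes SMC-based reduced correlation matrices, standard-normal random reference data, and the mean simulated eigenvalue as the reference line (a percentile is another common choice); `parallel_analysis` and the simulated data set are hypothetical.

```python
import numpy as np

def efa_eigenvalues(data):
    """Eigenvalues of the SMC-reduced correlation matrix of `data` (rows = cases)."""
    R = np.corrcoef(data, rowvar=False)
    reduced = R.copy()
    np.fill_diagonal(reduced, 1.0 - 1.0 / np.diag(np.linalg.inv(R)))
    return np.sort(np.linalg.eigvalsh(reduced))[::-1]

def parallel_analysis(data, n_sims=200, seed=0):
    """Compare observed EFA eigenvalues with those of random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = efa_eigenvalues(data)
    simulated = np.array(
        [efa_eigenvalues(rng.standard_normal((n, p))) for _ in range(n_sims)]
    )
    reference = simulated.mean(axis=0)  # assumption: mean, not a percentile
    exceeds = observed > reference
    # Number of leading eigenvalues that lie above the reference line.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, reference, n_factors

# Hypothetical data: 500 cases, 6 variables, one common factor with loading 0.7.
rng = np.random.default_rng(1)
factor = rng.standard_normal((500, 1))
data = 0.7 * factor + np.sqrt(1 - 0.7**2) * rng.standard_normal((500, 6))

observed, reference, n_factors = parallel_analysis(data)
print("observed :", np.round(observed, 3))
print("reference:", np.round(reference, 3))
print("factors above the simulated line:", n_factors)
print("factors with EFA eigenvalue > 0 :", int((observed > 0).sum()))
```

Both counts are computed on the EFA eigenvalues, so neither rule relies on the PCA threshold of 1.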

Nevertheless, this issue has come up before, and I want to do something about it. Therefore, there are 2 edits I will make to JASP for this issue:

  1. I will link to Dinno's paper in the help files to clear up any misconceptions around this point.
  2. As per your suggestion, I will change the y-axis label from "eigenvalues" to "EFA eigenvalues" to clarify that these are not the standard PCA eigenvalues.

AlexanderLyNL commented 4 years ago

Would a note on this in the help file clarify things as well?

vankesteren commented 4 years ago

yes, that is point 1 of my 2-point to-do list ;)

AlexanderLyNL commented 4 years ago

I can't read :(

juliuspfadt commented 1 year ago

@KenMavor I will close this since it seems solved to me. Feel free to reopen if your issue remains.