Open cdiazmun opened 3 years ago
Could you give an example use of the loadings.cutoff option that you are proposing? Would you like to submit a pull request? The related code is currently in this file: https://github.com/sinhrks/ggfortify/blob/master/R/fortify_stats.R
Following the example you use to illustrate your package:
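library(ggfortify)
df <- iris[1:4]  # assumed: the four numeric iris columns (df is not defined in this thread)
autoplot(prcomp(df), data = iris, colour = "Species",
         loadings = TRUE, loadings.colour = 'blue',
         loadings.label = TRUE, loadings.label.size = 3)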
If you run print(prcomp(df)), you get the standard deviations and the rotation matrix, which holds the loadings of the PCA:
Standard deviations (1, .., p=4):
[1] 2.0562689 0.4926162 0.2796596 0.1543862

Rotation (n x k) = (4 x 4):
                     PC1         PC2         PC3        PC4
Sepal.Length  0.36138659 -0.65658877  0.58202985  0.3154872
Sepal.Width  -0.08452251 -0.73016143 -0.59791083 -0.3197231
Petal.Length  0.85667061  0.17337266 -0.07623608 -0.4798390
Petal.Width   0.35828920  0.07548102 -0.54583143  0.7536574
With a cutoff option, you could then keep only the variables whose absolute loading exceeds a threshold (0.7 for instance, i.e. loadings above 0.7 or below -0.7) on PC1 or PC2, which are the components you want to plot:
autoplot(prcomp(df), data = iris, colour = "Species",
loadings = TRUE, loadings.colour = 'blue',
loadings.label = TRUE, loadings.label.size = 3,
loadings.cutoff = 0.7)
Then the final plot would only show the loadings for Sepal.Width (-0.73 on PC2) and Petal.Length (0.86 on PC1).
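Until such an option exists, the filtering itself can be sketched by hand on the rotation matrix. This is only a rough illustration in base R, not ggfortify code: filter_loadings and its arguments (cutoff, pcs) are hypothetical names, and the data is assumed to be the four numeric iris columns, as in the printed output above.

```r
# Hypothetical helper (not part of ggfortify): keep only the variables whose
# absolute loading reaches `cutoff` on at least one of the chosen components.
filter_loadings <- function(pca, cutoff, pcs = 1:2) {
  # Loadings for the components that will be plotted (PC1 and PC2 by default)
  rot <- pca$rotation[, pcs, drop = FALSE]
  # TRUE for each variable with |loading| >= cutoff on any selected component
  keep <- apply(abs(rot) >= cutoff, 1, any)
  rot[keep, , drop = FALSE]
}

# Assumption: the PCA is run on the unscaled numeric iris columns
pca <- prcomp(iris[1:4])
kept <- filter_loadings(pca, cutoff = 0.7)
print(rownames(kept))
```

With a cutoff of 0.7 this keeps Sepal.Width and Petal.Length, the same two variables named above. Using the absolute value makes the result independent of the sign that prcomp assigns to each component.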
Thank you! This looks very useful indeed. Would you like to submit changes to support this feature?
I'm very sorry, but I'm not very familiar with GitHub, so I don't actually know how to do that, or how to submit a pull request, although I have the feeling it's the same thing haha. I will read the guide and try it soon.
Okay great. I won't have time to get to this soon so feel free to give it a try!
Hi @terrytangyuan, has anyone made progress on implementing this yet (or on a workaround)? I'd certainly be interested in taking it on as a side project.
Nope. Go ahead
Hello!
First, thank you for developing the package; it has been very useful.
I am opening this issue to request (if possible) a new feature for plotting the factor loadings in a PCA. There are already nice aesthetic options for the loadings. However, I would be interested in a loadings.cutoff option to select only the desired ones. When working with PCAs based on many variables (50 in my case), the plot can become very messy even when playing with sizes and so on. Furthermore, there are some factors I may not be interested in because they don't explain any variance in the samples, so it would also be nice to be able to filter those out.
Thank you in advance.
Regards, Cristian