Open sbenthall opened 3 years ago
Could you explain this a bit more? I don't know what is meant by top/bottom and tags.
Is what you describe here partially contained in the Multi-dimensional scaling section of ./bigbang/examples/organizations/Full Archive Study.ipynb?
Thx :-)
Each working group can be seen as a document. Consider, for each email sent to the working group, the domain of the sender's email address as a word.
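To make the "document" framing concrete, here is a rough sketch of how the per-working-group domain counts could be built. The `emails` list and the `domain_counts` helper are made up for illustration and are not part of BigBang's API.

```python
from collections import Counter

# Hypothetical (working_group, sender_address) pairs; not real archive data.
emails = [
    ("wg-a", "alice@example.com"),
    ("wg-a", "bob@example.com"),
    ("wg-a", "carol@example.org"),
    ("wg-b", "dave@example.net"),
    ("wg-b", "erin@Example.org"),
]


def domain_counts(pairs):
    """Map each working group to a Counter of sender domains (the "words")."""
    counts = {}
    for group, address in pairs:
        # Take everything after the last "@" and normalize case.
        domain = address.rsplit("@", 1)[-1].lower()
        counts.setdefault(group, Counter())[domain] += 1
    return counts


counts = domain_counts(emails)
for group, words in counts.items():
    print(group, dict(words))
```

Each `Counter` plays the role of one document's bag of words.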
PCA on the set of documents will produce a set of dimensions expressed as weights on each of the email domains.
In the Multi-dimensional scaling section of that notebook, each dimension is summarized by the email domains with the highest and lowest weights.
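As a sketch of that existing kind of summary (not the notebook's actual code), assume a component is available as a plain dict from domain to weight. The weights and the `extreme_domains` helper below are invented for illustration.

```python
# Hypothetical weights for one principal component, keyed by email domain.
component = {
    "example.com": 0.7,
    "example.org": 0.1,
    "example.edu": -0.2,
    "example.net": -0.6,
}


def extreme_domains(weights, k=5):
    """Return (highest, lowest) k domains by weight, most extreme first."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:k], ranked[-k:][::-1]


top, bottom = extreme_domains(component, k=2)
print("highest:", top)
print("lowest:", bottom)
```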
This issue asks for an additional, alternative way of summarizing the principal components.
Given a principal component and a working group as a document, the dot product of the principal component weights and that working group's "word" counts gives the working group a scalar score.
So it is possible, for each component, to show the top five/bottom five working groups according to that score.
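Here is a minimal sketch of the summary this issue asks for, assuming the counts and component weights are available as plain dicts. All names and numbers below are hypothetical, and the score is the raw dot product described above.

```python
# Hypothetical domain counts per working group (each group is a "document").
counts = {
    "wg-a": {"example.com": 3, "example.org": 1},
    "wg-b": {"example.net": 4},
    "wg-c": {"example.com": 1, "example.net": 1},
}

# Hypothetical weights for one principal component, keyed by email domain.
component = {"example.com": 0.5, "example.org": 0.25, "example.net": -0.5}


def score(weights, doc_counts):
    """Dot product of component weights and a document's domain counts."""
    return sum(w * doc_counts.get(domain, 0) for domain, w in weights.items())


def top_bottom_groups(weights, all_counts, k=5):
    """Return the k highest- and k lowest-scoring working groups."""
    scores = {group: score(weights, c) for group, c in all_counts.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = [(g, scores[g]) for g in ranked[:k]]
    bottom = [(g, scores[g]) for g in ranked[-k:][::-1]]
    return top, bottom


top, bottom = top_bottom_groups(component, counts, k=1)
print("top:", top)
print("bottom:", bottom)
```

If the counts are mean-centered before projecting, as PCA implementations commonly do, every score for a given component shifts by the same constant, so the top/bottom ranking should come out the same.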
Is that a clear explanation?
Illustrate the principal components with top/bottom working group tags.