I have been following your tutorial on the steps to perform PCA at https://privefl.github.io/bigsnpr/articles/bedpca.html.
In the final step “Project remaining individuals”, am I correct in thinking that all remaining individuals are projected (i.e., the related individuals and individuals identified as outliers in the first PCA who were subsequently excluded from the final PCA calculation)? If so, should I subsequently remove participants who were previously identified as PCA outliers?
Working through the materials, I removed outliers with S > 0.275, a threshold I chose from the histogram of “outlierness”. Perhaps I am mistaken, but after looking at other PCA methods, I had thought that I would also need to exclude these outlier samples when projecting individuals onto my final PCs…
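To make the question concrete, here is a minimal sketch in plain Python of the bookkeeping I have in mind. The sample IDs and outlierness values are made up for illustration (they are not from my data, and none of this is bigsnpr code); only the 0.275 threshold is the one I actually used.

```python
# Hypothetical toy example: IDs and values are invented for illustration.
all_ids = ["ind1", "ind2", "ind3", "ind4", "ind5"]
related = {"ind5"}  # assumed: set aside for relatedness before the first PCA

# Outlierness statistic S from the first PCA (unrelated individuals only).
outlierness = {"ind1": 0.12, "ind2": 0.31, "ind3": 0.20, "ind4": 0.28}
THRESHOLD = 0.275  # cutoff chosen from the histogram

# Individuals flagged as outliers in the first PCA.
outliers = {iid for iid, s in outlierness.items() if s > THRESHOLD}

# Individuals used to compute the final PCs: unrelated and not outliers.
used_in_pca = sorted(set(all_ids) - related - outliers)

# If "remaining individuals" means everyone else, these get projected.
projected = sorted(related | outliers)

# What I assumed I should keep downstream: everyone except the outliers.
kept = sorted(set(all_ids) - outliers)

print("used in final PCA:", used_in_pca)
print("projected:", projected)
print("kept after dropping outliers:", kept)
```

My question is essentially whether `projected` is the right reading of the final step, and whether I should then restrict my analysis to `kept`.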
Below are a histogram and two plots of the PC scores coloured by the outlierness statistic:
Thanks in advance!