DReichLab / EIG

Eigen tools by Nick Patterson and Alkes Price lab

smartpca eigen vector file missing individuals #83

Open victor-canta opened 1 year ago

victor-canta commented 1 year ago

Hello,

I noticed that I am missing some individuals in my eigen vector output file after running smartpca. My total number of individuals is 322 and the file only has information for 300 individuals. Is there some form of filtering that is done by the software that I am not accounting for? I also checked the .ind file to see if some may have been mis-assigned to a population with a typo but that doesn't seem to be the case. I would greatly appreciate your help.

bumblenick commented 1 year ago

Samples with no data, or very little, are dropped. There may be a message in the log file: "insufficient data..." Almost certainly there is a data issue with your dropped samples.

Nick
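
To see exactly which samples were dropped, one option is to compare the sample IDs in the .ind file against those in the eigenvector output. The following is a minimal sketch, not part of EIG: it assumes the .ind file has the sample ID in the first column, that the .evec file starts each data row with the sample ID, and that its header line begins with `#eigvals:`. The function names are hypothetical. If the input was in PED format, the IDs in the .evec file may be written differently and would need to be normalized first.

```python
from typing import Iterable, List, Set


def evec_sample_ids(evec_lines: Iterable[str]) -> Set[str]:
    """Collect sample IDs (first field) from .evec lines, skipping the header."""
    ids = set()
    for line in evec_lines:
        fields = line.split()
        # Skip blank lines and the "#eigvals:" header line.
        if not fields or fields[0].startswith("#"):
            continue
        ids.add(fields[0])
    return ids


def dropped_samples(ind_lines: Iterable[str], evec_lines: Iterable[str]) -> List[str]:
    """Return IDs present in the .ind file but absent from the .evec file."""
    kept = evec_sample_ids(evec_lines)
    dropped = []
    for line in ind_lines:
        fields = line.split()
        if fields and fields[0] not in kept:
            dropped.append(fields[0])
    return dropped
```

To use it, pass the open file handles for the two files (for example `dropped_samples(open("data.ind"), open("data.evec"))`, where both file names are placeholders) and look for the returned IDs in the smartpca log.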

victor-canta commented 1 year ago

Hello Nick,

Thank you for the insight. Is there a way to see what criteria these samples failed? All of these samples already went through a filtering process (vcftools --max-missing) before they were prepared for smartpca format.

Thanks again, Victor

bumblenick commented 1 year ago

I assume there is nothing useful in the logfile. I would run convertf and write out the genotypes for these 22 samples in eigenstrat format. Maybe the problem will be obvious.

N
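
Once the genotypes are in EIGENSTRAT format, a per-sample missing-data rate is a quick first check. Note that `vcftools --max-missing` filters sites by their proportion of missing data, not individuals, so a sample can pass that step and still be mostly missing. The sketch below is an illustration under stated assumptions, not EIG code: it assumes an unpacked text .geno file (one line per SNP, one character per sample, `9` meaning missing) with columns in the same order as the .ind file. The function name is hypothetical.

```python
from typing import Dict, Iterable, List


def missing_rate_per_sample(geno_lines: Iterable[str], sample_ids: List[str]) -> Dict[str, float]:
    """Return the fraction of SNPs coded as missing ('9') for each sample."""
    missing = [0] * len(sample_ids)
    n_snps = 0
    for line in geno_lines:
        row = line.strip()
        if not row:
            continue
        # Each SNP line must have exactly one genotype character per sample.
        if len(row) != len(sample_ids):
            raise ValueError(
                f"SNP line has {len(row)} genotypes, expected {len(sample_ids)}"
            )
        n_snps += 1
        for i, genotype in enumerate(row):
            if genotype == "9":
                missing[i] += 1
    if n_snps == 0:
        raise ValueError("no SNP lines found")
    return {sid: count / n_snps for sid, count in zip(sample_ids, missing)}
```

Sorting the result by rate and comparing the top entries with the 22 dropped IDs should show whether missing data explains the drops.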
