dougspeed / LDAK


Error Encountered: "None of the 1108 individuals have phenotypes and their mother in the data" #9

Open huijie615 opened 1 month ago

huijie615 commented 1 month ago

Hi Doug, I'm currently using LDAK (Version 6) for my analysis, and I've encountered an error that I'm struggling to resolve. Here are the details of the issue:

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
LDAK - Software for obtaining Linkage Disequilibrium Adjusted Kinships and Loads More
Version 6 - Help pages at http://www.ldak.org
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
There are 6 pairs of arguments:
--linear 
--duos MOTHERS
--pheno 
--bfile 
--covar
--max-threads 8
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

Performing mother-offspring linear regression

Consider using "--top-preds" to include (strongly-associated) predictors as extra covariates

Will compute standard test statistics; use "--spa-test YES" to switch to a saddlepoint approximation, 

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

Reading IDs for 1108 samples from ./data_filtered.fam

Reading details for 8514610 predictors from ./data_filtered.bim
Warning, Predictor chr1:98618:INDEL has multi-character alleles (T and TTGAC) and will be ignored (to instead retain this predictor add "--allow-multi YES")
Warning, Predictor chr1:733014:INDEL has multi-character alleles (A and AG) and will be ignored (to instead retain this predictor add "--allow-multi YES")
Warning, Predictor chr1:766399:INDEL has multi-character alleles (G and GAATA) and will be ignored (to instead retain this predictor add "--allow-multi YES")
Warning, Predictor chr1:784474:INDEL has multi-character alleles (C and CAG) and will be ignored (to instead retain this predictor add "--allow-multi YES")
Warning, Predictor chr1:805514:INDEL has multi-character alleles (A and AC) and will be ignored (to instead retain this predictor add "--allow-multi YES")
In total 480692 predictors have multi-character alleles

Data contain 1108 samples and 8514610 predictors; will be using 1108 and 8033918

Searching for offspring-mother pairs (using parental IDs from the 3rd and 4th columns of ./data_filtered.fam

There are 499 samples with maternal IDs

Checking responses for 602 samples from /xxx/temp.bmi.pheno
Due to missing phenotypic values, the number of samples is reduced from 499 to 475

Error, none of the 1108 individuals have phenotypes and their mother in the data

This error seems to arise during the "Find duos" process. It indicates that none of the 1108 individuals have phenotypes and their mother in the data. However, I do have 475 offspring with phenotypes and their mothers' IDs (as recorded in the log file), so I'm unsure why this error occurs.

Could you please provide any suggestions for resolving this issue?

Best regards, Huijie

dougspeed commented 1 month ago

Hi Huijie, thanks for the question, and sorry for the problems.

Yes, the problems occur when searching for individuals with both a phenotype and their mother in the data.

LDAK first sees which individuals have a maternal ID (i.e., column 4 of the fam file is NOT "0") - let's denote these individuals List1 - there are 499

Then LDAK sees which List1 individuals have phenotypes - let's denote these individuals List2 - there are 475

Lastly, LDAK sees which List2 individuals have their mother in the data. This is where LDAK finds zero. I.e., while there are 475 individuals with both a phenotype and a maternal ID, LDAK is unable to find any of the mothers for these individuals in the data. If you look at the fam file, does that sound correct?
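To see at which stage the count drops to zero, the three steps above can be reproduced outside LDAK. The following is only a rough sketch, not LDAK's actual code: the function name `count_duo_stages` and the `phenotyped` set are made up for illustration, and it assumes a mother is looked up by the pair (family ID, maternal ID), which is how the fam-file example in this thread reads.

```python
def count_duo_stages(fam_rows, phenotyped):
    """Return the sizes of List1, List2 and the final list of usable duos.

    fam_rows:   list of (family_id, individual_id, paternal_id, maternal_id),
                i.e. the first four columns of the fam file.
    phenotyped: set of (family_id, individual_id) with a non-missing phenotype
                (hypothetical input; build it from your own phenotype file).
    """
    # Everyone who is actually present in the data.
    present = {(fid, iid) for fid, iid, _pid, _mid in fam_rows}

    # List1: individuals whose maternal ID (column 4) is not "0".
    list1 = [(fid, iid, mid) for fid, iid, _pid, mid in fam_rows if mid != "0"]

    # List2: List1 individuals that also have a phenotype.
    list2 = [row for row in list1 if (row[0], row[1]) in phenotyped]

    # Final step: List2 individuals whose mother is present in the data.
    # ASSUMPTION: the mother is matched on (family ID, maternal ID), exactly.
    list3 = [row for row in list2 if (row[0], row[2]) in present]

    return len(list1), len(list2), len(list3)


# Toy data: the second family has a lower-case typo in the maternal ID.
fam_rows = [
    ("Fam1", "Ind1", "0", "Ind2"),
    ("Fam1", "Ind2", "0", "0"),
    ("Fam2", "Ind3", "0", "ind4"),
    ("Fam2", "Ind4", "0", "0"),
]
phenotyped = {("Fam1", "Ind1"), ("Fam2", "Ind3")}

print(count_duo_stages(fam_rows, phenotyped))
```

On a real data set, `fam_rows` could be built by splitting each line of the fam file on whitespace and keeping the first four fields. If the first two counts are 499 and 475 but the third is 0, the maternal IDs in column 4 do not match any individual ID exactly.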

Then please check for small typos. E.g., suppose the first two rows of the fam file are

    Fam1 Ind1 0 Ind2 0 0
    Fam1 Ind2 0 0 0 0

Then the individual "Fam1 Ind1" has a maternal ID, and their mother is called "Fam1 Ind2", who is IN the data.

However, if the rows were

    Fam1 Ind1 0 ind2 0 0
    Fam1 Ind2 0 0 0 0

or

    Fam1 Ind1 0 Fam1_Ind2 0 0
    Fam1 Ind2 0 0 0 0

etc., then LDAK cannot find the mother.
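With hundreds of duos, spotting such typos by eye is tedious, so a small script can flag maternal IDs that almost match someone in the data. This is a sketch under stated assumptions, not part of LDAK: `suggest_mother_matches` is a hypothetical helper, it only checks the two kinds of typo shown above (different letter case, and a "FamilyID_" prefix on the maternal ID), and it again assumes mothers are matched on (family ID, maternal ID).

```python
def suggest_mother_matches(fam_rows):
    """List maternal IDs with no exact match, plus a likely intended mother.

    fam_rows: list of (family_id, individual_id, paternal_id, maternal_id).
    Returns a list of ((family_id, individual_id), maternal_id, guess), where
    guess is the (family_id, individual_id) of a near match, or None.
    """
    present = {(fid, iid) for fid, iid, _pid, _mid in fam_rows}

    # Index individuals by lower-cased IDs so case typos can be detected.
    by_loose = {}
    for fid, iid in present:
        by_loose.setdefault((fid.lower(), iid.lower()), (fid, iid))

    suggestions = []
    for fid, iid, _pid, mid in fam_rows:
        # Skip individuals without a maternal ID or with an exact match.
        if mid == "0" or (fid, mid) in present:
            continue

        # Candidate spellings: the ID as given (case-insensitive), and the ID
        # with a leading "<family ID>_" stripped off.
        mid_lower = mid.lower()
        candidates = [mid_lower]
        prefix = fid.lower() + "_"
        if mid_lower.startswith(prefix):
            candidates.append(mid_lower[len(prefix):])

        guess = None
        for candidate in candidates:
            guess = by_loose.get((fid.lower(), candidate))
            if guess is not None:
                break
        suggestions.append(((fid, iid), mid, guess))

    return suggestions


# Toy data covering both typos from the examples, plus a truly absent mother.
rows = [
    ("Fam1", "Ind1", "0", "ind2"),
    ("Fam1", "Ind2", "0", "0"),
    ("Fam1", "Ind3", "0", "Fam1_Ind2"),
    ("Fam1", "Ind4", "0", "Ind9"),
]
for child, maternal_id, guess in suggest_mother_matches(rows):
    print(child, maternal_id, guess)
```

A guess of `None` means no near match was found, which would suggest the mother is genuinely missing from the fam file rather than misspelled.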

Thanks