genetics-statistics / GEMMA

Genome-wide Efficient Mixed Model Association
https://github.com/genetics-statistics/GEMMA
GNU General Public License v3.0

Null Relatedness Matrix in Gemma #104

Closed Gen-Harrison closed 7 years ago

Gen-Harrison commented 7 years ago

Hi, I want to run the BSLMM in GEMMA, but with a relatedness matrix in which none of the individuals are related. To do this, I created a matrix of 0s and 1s, where 1 is relatedness to self and 0 is relatedness to others. My goal in running GEMMA is to look at the PVE/PGE of cis-eQTL variants on the expression of a given gene. When I run GEMMA, I get results for about 50% of the genes tested, and for the others I get this error:

/var/spool/pbs/mom_priv/jobs/6501659.egeon2.SC: line 113: 30385 Aborted /exec5/GROUP/barreiro/barreiro/barreiro_group/Programs/GEMMA/bin/gemma -bfile $LSCRATCH/plinkFile.ENSG00000115607.admix_variation_only.CTL -k 20Sept17_Kiga_Batwa_ForGemma_Final_Samples_CTL_Standardized_FAKE_Relatedness.sXX.txt -bslmm 1 -n 2984 -o gemma_output_null_relatedness/CTL/08/bslmm_admix_variation_only_CTL.2984
gsl: newton.c:74: ERROR: derivative is zero
Default GSL error handler invoked.

I would appreciate any feedback or ideas on why I am getting this.

pcarbo commented 7 years ago

@Gen-Harrison Providing the identity matrix as the relatedness matrix is unusual. I haven't tried this before, but it could definitely cause non-identifiability issues, because the model then essentially has two variance components with the same covariance structure (both are the identity matrix): the random-effects term and the residual term. This may explain the error you are getting.
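A minimal sketch of the non-identifiability point, assuming a simplified mixed model with a single variance-ratio parameter, y ~ N(mu, (1/tau)(lam*K + I)). This is not GEMMA's code, and `profile_loglik` is a hypothetical helper written for illustration. When K is the identity, lam*K + I = (lam + 1)*I, so lam can be absorbed into tau and the profile log-likelihood does not change with lam. A flat objective like this would be consistent with a Newton step reporting a zero derivative, although the thread does not confirm which GEMMA routine raised the error.

```python
import numpy as np

def profile_loglik(lam, y, K):
    """ML log-likelihood of y ~ N(mu*1, (1/tau) * (lam*K + I)),
    with the intercept mu and the precision tau profiled out.
    Hypothetical helper for illustration; not part of GEMMA."""
    n = y.shape[0]
    H = lam * K + np.eye(n)
    Hinv = np.linalg.inv(H)
    ones = np.ones(n)
    mu = (ones @ Hinv @ y) / (ones @ Hinv @ ones)  # GLS estimate of the intercept
    r = y - mu
    q = r @ Hinv @ r                               # equals n / tau_hat
    _, logdet = np.linalg.slogdet(H)
    return -0.5 * n * np.log(2 * np.pi * q / n) - 0.5 * logdet - 0.5 * n

rng = np.random.default_rng(0)
n = 50
y = rng.normal(size=n)                    # simulated phenotype (illustrative only)
lams = np.array([0.01, 0.1, 1.0, 10.0])   # candidate variance ratios

K_identity = np.eye(n)                    # "nobody is related" matrix from the question
G = rng.normal(size=(n, 200))             # simulated genotype-like matrix (illustrative only)
K_structured = G @ G.T / G.shape[1]       # a relatedness matrix with real structure

flat = np.array([profile_loglik(l, y, K_identity) for l in lams])
varied = np.array([profile_loglik(l, y, K_structured) for l in lams])

print("spread of log-likelihood, K = I:        ", np.ptp(flat))
print("spread of log-likelihood, structured K: ", np.ptp(varied))
```

With the identity matrix, the spread across lam is at the level of floating-point noise, so the data carry no information about how to split variance between the two components. With a structured K the log-likelihood does change with lam.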

Note, from the manual: "GEMMA does not require the user to provide a relatedness matrix explicitly. It internally calculates and uses the centered relatedness matrix, which has the nice interpretation that each effect size βi follows a mixture of two normal distributions a priori."

You might want to try running the BSLMM model without providing a relatedness matrix.

Gen-Harrison commented 7 years ago

Thanks for the response. We launched it today without the relatedness matrix, per your advice. The reason we were providing the identity matrix is that we are looking at differences in a phenotype between two populations, but one of our populations has lower genetic diversity than the other, and in this population many of the individuals are related; the phenotype we are trying to measure also correlates with population. To avoid this, we wanted a measure of PVE/PGE in which relatedness was not considered in the model at all. This is the first time I am trying this type of analysis, so I'm not completely sure it makes sense.

I'll see what the output looks like from the run we are trying now. Thanks again.

pjotrp commented 7 years ago

I think the kinship matrix should not be identity. @xiangzhou ?

xiangzhou commented 7 years ago

Yes, I agree. The kinship matrix cannot be identity -- in this case, the model is not identifiable.

pcarbo commented 7 years ago

@Gen-Harrison Given the differences between the populations, you might try running GEMMA on the two populations separately. Alternatively, you could include a covariate to indicate population of origin, although this would assume the genetic effects are the same in both populations, which may not be a good assumption.

For follow-up questions, please post to the GEMMA mailing list. Please see:

https://groups.google.com/group/gemma-discussion

(Note that it is easy to unsubscribe at any point.)