bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0

Estimating bivariate LDSC intercept when h2 is low #444

Open tom-a-bond opened 1 month ago

tom-a-bond commented 1 month ago

Hi,

We are wondering whether it is valid to use LDSC to estimate the bivariate intercept (to be used as an indicator of sample overlap) when trait h2 is low?

For our traits we get the following LDSC results:

Heritability of phenotype 1
---------------------------
Total Observed scale h2: 0.0031 (0.0101)
Lambda GC: 0.9634
Mean Chi^2: 0.9637
Intercept: 0.961 (0.0066)
Ratio: NA (mean chi^2 < 1)

Heritability of phenotype 2/2
-----------------------------
Total Observed scale h2: -0.0006 (0.0065)
Lambda GC: 0.8836
Mean Chi^2: 0.9061
Intercept: 0.9068 (0.0068)
Ratio: NA (mean chi^2 < 1)

Genetic Covariance
------------------
Total Observed scale gencov: 4.1236e-05 (0.0052)
Mean z1*z2: 0.198
Intercept: 0.198 (0.0043)

Genetic Correlation
-------------------
Genetic Correlation: nan (nan) (h2  out of bounds) 
Z-score: nan (nan) (h2  out of bounds)
P: nan (nan) (h2  out of bounds)
WARNING: One of the h2's was out of bounds.
This usually indicates a data-munging error or that h2 or N is low.

We also initially got the LDSC error "FloatingPointError: invalid value encountered in sqrt", apparently because some of the jackknife h2 estimates were negative, but it appears that the genetic covariance estimate may still be fine in this case (see https://github.com/bulik/ldsc/issues/40). LDSC ran without error when a slightly different set of input SNPs was used. Clearly, in our case h2 is so low for both traits that rG won't be meaningful. But is there any reason to worry that the intercept estimate will be biased?
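For readers wondering why rG comes out as nan here: genetic correlation is the genetic covariance divided by the square root of the product of the two heritabilities, so a negative h2 estimate puts a negative number under the square root. The sketch below reproduces that arithmetic with the point estimates from the output above. The function name `genetic_correlation` is hypothetical and is not part of the ldsc code base.

```python
import numpy as np

# Point estimates copied from the LDSC output above.
h2_1 = 0.0031
h2_2 = -0.0006
gencov = 4.1236e-05


def genetic_correlation(gencov, h2_1, h2_2):
    """Return gencov / sqrt(h2_1 * h2_2).

    Hypothetical helper for illustration. Returns nan when the product
    of the heritabilities is not positive, because the square root is
    then undefined (or the ratio divides by zero).
    """
    product = h2_1 * h2_2
    if product <= 0:
        return float("nan")
    return gencov / np.sqrt(product)


rg = genetic_correlation(gencov, h2_1, h2_2)
print("h2_1 * h2_2 =", h2_1 * h2_2)
print("rg =", rg)
```

The intercept of the genetic covariance regression does not pass through this square root, which is consistent with the observation that it is still reported when rG is not.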

Many thanks in advance!

aksarkar commented 1 month ago

If your goal is to determine whether there are overlapping samples, estimating the ldsc intercept is not a valid way to do it, since sample overlap is only one of many reasons why the ldsc intercept might not be equal to 1.

tom-a-bond commented 1 month ago

Thanks aksarkar. We appreciate that the interpretation of the bivariate LDSC intercept is somewhat complex. Our question is more technical, though: is the bivariate LDSC intercept likely to be biased when trait h2 is low?

aksarkar commented 1 month ago

The derivation does not suggest that would happen; however, real data may be more complex than the scenarios considered in the derivation.

I do not know of simulation results that demonstrate the intercept is not biased in that regime. You might need to do one to convince yourself.
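A minimal sketch of what such a simulation could look like follows. It is not ldsc's own procedure and rests on strong assumptions, all of which are illustrative: z-scores are drawn directly from the cross-trait LD score regression model, E[z1*z2] = sqrt(N1*N2) * rho_g * l / M + intercept, rather than from genotypes; SNPs are treated as independent; LD scores are drawn from an arbitrary gamma distribution; and the intercept is estimated by unweighted least squares instead of ldsc's weighted regression with a block jackknife. The function name, parameter names, and default values are hypothetical.

```python
import numpy as np


def simulate_intercepts(h2_1, h2_2, rho_g, true_intercept,
                        n1=50_000, n2=50_000, m=100_000,
                        n_reps=100, seed=0):
    """Draw z-scores under the cross-trait LD score regression model and
    return the OLS intercept of z1*z2 on LD score for each replicate.

    rho_g is the genetic covariance; true_intercept is the value the
    regression intercept should recover if the estimator is unbiased.
    """
    rng = np.random.default_rng(seed)

    # Assumed LD scores: right-skewed and greater than 1 (illustrative only).
    ld = 1.0 + rng.gamma(shape=2.0, scale=50.0, size=m)

    # Model-implied per-SNP variances and covariance of (z1, z2).
    var1 = 1.0 + n1 * h2_1 * ld / m
    var2 = 1.0 + n2 * h2_2 * ld / m
    cov12 = np.sqrt(n1 * n2) * rho_g * ld / m + true_intercept

    # Write z2 = slope * z1 + noise so that the pair has the covariance above.
    sd1 = np.sqrt(var1)
    slope = cov12 / var1
    resid_sd = np.sqrt(var2 - cov12 ** 2 / var1)

    design = np.column_stack([np.ones(m), ld])
    intercepts = np.empty(n_reps)
    for r in range(n_reps):
        z1 = sd1 * rng.standard_normal(m)
        z2 = slope * z1 + resid_sd * rng.standard_normal(m)
        # Unweighted regression of z1*z2 on LD score; coef[0] is the intercept.
        coef, *_ = np.linalg.lstsq(design, z1 * z2, rcond=None)
        intercepts[r] = coef[0]
    return intercepts


# Low-heritability scenario with a true intercept of 0.198 (assumed values).
est = simulate_intercepts(h2_1=0.003, h2_2=0.001, rho_g=0.0,
                          true_intercept=0.198)
print(f"mean estimated intercept: {est.mean():.4f}")
print(f"sd across replicates:     {est.std(ddof=1):.4f}")
```

Because the data are generated from the model itself, a run like this can only show how the estimator behaves when the model holds. Checking robustness to the complexities of real data would need genotype-based simulation with real LD structure, followed by running ldsc on the simulated summary statistics.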