fastlmm / FaST-LMM

Python version of Factored Spectrally Transformed Linear Mixed Models
https://fastlmm.github.io/
Apache License 2.0
47 stars 11 forks source link

LMM.getPosteriorWeights() returns inconsitent results #12

Closed remomomo closed 3 years ago

remomomo commented 3 years ago

Hi,

I'm using fastlmm.inference.lmm.LMM to fit mixed models with no background kernel.

After fitting the model, I want to get the posterior weights of the SNPs in the kernel. I do this with LMM.getPosteriorWeights. I've noticed that getPostiorWeights returns inconsistent results, depending on how it's parameterized.

Here is a full reproducible example:

import fastlmm
import numpy as np
from numpy.linalg import cholesky
from sklearn.metrics.pairwise import rbf_kernel
from fastlmm.inference.lmm import LMM
from matplotlib import pyplot as plt

n_snp = 10

np.random.seed(1)

# fixed effects
X = np.random.randn(1000,5)

snps = np.random.randn(1000,n_snp) # 10 SNPs

pos = np.linspace(0,n_snp,num=10)
R = rbf_kernel(pos[:,np.newaxis], gamma=-1*np.log(0.5)/(2**2))

plt.imshow(R) # correlation matrix of SNPs

R /= np.max(R)

L = cholesky(R + np.eye(len(R))*1e-7)

# correlate the SNPs
snps = L.dot(snps.T)

snps.shape

snps = snps.T

plt.scatter(x=snps[:,0],y=snps[:,1])

plt.imshow(np.corrcoef(snps.T))

w = np.random.randn(n_snp) # single variant weights
w[0:7] = 0. 
w # 3 causal SNPs and 7 noise SNPs

b = np.array([0.1, 0.01, 0.5, -0.15, 0.05]) # true beta

# phenotype is a function of fixed effects X, random effects (snps) + noise
y = X.dot(b[:,np.newaxis])
y += snps.dot(w[:,np.newaxis])
y[:,0] += np.random.randn(1000)  # noise

lmm = LMM()
lmm.setG(snps)
lmm.setX(X)
lmm.sety(y[:,0])

lik = lmm.findH2()
lik

plt.scatter(b, lik['beta']) # fixed effects are accurately estimated

# get the weights of the snps in the kernel
beta_random = lmm.getPosteriorWeights(lik['beta'], h2=lik['h2'])

plt.scatter(w, beta_random) # far away from the real weights

logdelta = lmm.find_log_delta(n_snp)
logdelta

beta_random_2 = lmm.getPosteriorWeights(logdelta['beta'], logdelta=logdelta['log_delta'])

plt.scatter(w, beta_random_2) # much closer to the real weights

I now use the second approach with logdelta instead of h2, because these tests show that the estimated weights are much closer to the true weights.

My question is now: am I doing something wrong, or is this a bug?

Also, I would have liked to get the posterioir variances as well. The documentation states that this can be triggered by setting the flag returnVar = True. However, this option is not actually an argument to the function.

The documentation regarding what is returned is wrong as well. It states that the function should return a dictionary: " weights : [k0+k1] 1-dimensional array of predicted phenotype values". However, it simply returns a numpy array of weights.

CarlKCarlK commented 3 years ago

remomomo,

Thanks for using FaST-LMM!

I should be able to look into this today.

Yours, Carl

Carl Kadie, Ph.D. FaST-LMM & PySnpTools Teamhttps://fastlmm.github.io/ (Microsoft Research, retired) https://www.linkedin.com/in/carlk/

Join the FaST-LMM user discussion and announcement list via emailmailto:fastlmm-user-join@python.org?subject=Subscribe (or use web sign uphttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman3%2Flists%2Ffastlmm-user.python.org&data=02%7C01%7C%7C13a5c33d7cd84cad5cdf08d7bba56e20%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637184191498409587&sdata=2CQWjQEwOpQol2rQ1eoyVTgY8WvInV8UH31Wtl68FzY%3D&reserved=0)

From: remomomo notifications@github.com Sent: Wednesday, January 20, 2021 2:12 AM To: fastlmm/FaST-LMM FaST-LMM@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [fastlmm/FaST-LMM] LMM.getPosteriorWeights() returns inconsitent results (#12)

replace sns.scatterplot by plt.scatterplot in the example above. Sorry for the error!

- You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ffastlmm%2FFaST-LMM%2Fissues%2F12%23issuecomment-763494516&data=04%7C01%7C%7Ca84cac7f32374201ccdd08d8bd2bc702%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637467343000767367%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Q0Ika3aalHHCuz4rQwe11WNWPhmvl5mnxlgNH7iiwtQ%3D&reserved=0, or unsubscribehttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABR65PZWNGPACLAGPERFEHDS22T5VANCNFSM4WKNQZHQ&data=04%7C01%7C%7Ca84cac7f32374201ccdd08d8bd2bc702%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637467343000777322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=F1cVXchmXDTP%2FuQnqi5ZUwsknFsU95xkpo5NR%2Baw3G4%3D&reserved=0.

CarlKCarlK commented 3 years ago

remomomo,

I'm sorry I don't have a good handle on using the inference.lmm.LMM class directly. It's an internal class and not documented externally. Here are some ideas:

Thanks again for using FaST-LMM,

From: Carl KADIE Sent: Wednesday, January 20, 2021 7:26 AM To: fastlmm/FaST-LMM reply@reply.github.com; fastlmm/FaST-LMM FaST-LMM@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: RE: [fastlmm/FaST-LMM] LMM.getPosteriorWeights() returns inconsitent results (#12)

remomomo,

Thanks for using FaST-LMM!

I should be able to look into this today.

Yours, Carl

Carl Kadie, Ph.D. FaST-LMM & PySnpTools Teamhttps://fastlmm.github.io/ (Microsoft Research, retired) https://www.linkedin.com/in/carlk/

Join the FaST-LMM user discussion and announcement list via emailmailto:fastlmm-user-join@python.org?subject=Subscribe (or use web sign uphttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman3%2Flists%2Ffastlmm-user.python.org&data=02%7C01%7C%7C13a5c33d7cd84cad5cdf08d7bba56e20%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637184191498409587&sdata=2CQWjQEwOpQol2rQ1eoyVTgY8WvInV8UH31Wtl68FzY%3D&reserved=0)

From: remomomo notifications@github.com<mailto:notifications@github.com> Sent: Wednesday, January 20, 2021 2:12 AM To: fastlmm/FaST-LMM FaST-LMM@noreply.github.com<mailto:FaST-LMM@noreply.github.com> Cc: Subscribed subscribed@noreply.github.com<mailto:subscribed@noreply.github.com> Subject: Re: [fastlmm/FaST-LMM] LMM.getPosteriorWeights() returns inconsitent results (#12)

replace sns.scatterplot by plt.scatterplot in the example above. Sorry for the error!

- You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ffastlmm%2FFaST-LMM%2Fissues%2F12%23issuecomment-763494516&data=04%7C01%7C%7Ca84cac7f32374201ccdd08d8bd2bc702%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637467343000767367%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Q0Ika3aalHHCuz4rQwe11WNWPhmvl5mnxlgNH7iiwtQ%3D&reserved=0, or unsubscribehttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABR65PZWNGPACLAGPERFEHDS22T5VANCNFSM4WKNQZHQ&data=04%7C01%7C%7Ca84cac7f32374201ccdd08d8bd2bc702%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637467343000777322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=F1cVXchmXDTP%2FuQnqi5ZUwsknFsU95xkpo5NR%2Baw3G4%3D&reserved=0.

remomomo commented 3 years ago

Hi Carl,

Thanks for looking in to it and thanks for your suggestions!

I use the LMM class to calculate log likelihoods for a custom implementation of likelihood ratio tests. I also stumbled across lmm_cov yesterday... but it unfortunately doesn't have the function that returns posterior weights. I'm working with Christoph actually! I'll see if he remembers anything about these things.

cheers