Closed snowformatics closed 5 months ago
Yes, I can write a little script that computes it the same way that FaST-LMM does.
It actually computes one kinship matrix per leave-out-out-chromosome. Is that what you need or just one that uses all the chromosomes?
From: snowformatics @.> Sent: Monday, June 17, 2024 2:13 AM To: fastlmm/FaST-LMM @.> Cc: Subscribed @.***> Subject: [fastlmm/FaST-LMM] Kinship (Issue #49) Importance: High
Hi,
I it possible to extract the kinship matrix calculated by fast-lmm?
Thanks Stefanie
- Reply to this email directly, view it on GitHubhttps://github.com/fastlmm/FaST-LMM/issues/49, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ABR65P74IEBZETKMQQYVKCLZH2SCBAVCNFSM6AAAAABJNTBVKWVHI2DSMVQWIX3LMV43ASLTON2WKOZSGM2TMOBSGYYTGMI. You are receiving this because you are subscribed to this thread.Message ID: @.**@.>>
Hi Carl,
I want to plot the matrix to represent the genetic relatedness of the individuals so I would need the data from all chromosomes.
Thanks Stefanie
Carl Kadie @.***> schrieb am Mo. 17. Juni 2024 um 17:27:
Yes, I can write a little script that computes it the same way that FaST-LMM does.
It actually computes one kinship matrix per leave-out-out-chromosome. Is that what you need or just one that uses all the chromosomes?
- Carl
From: snowformatics @.> Sent: Monday, June 17, 2024 2:13 AM To: fastlmm/FaST-LMM @.> Cc: Subscribed @.***> Subject: [fastlmm/FaST-LMM] Kinship (Issue #49) Importance: High
Hi,
I it possible to extract the kinship matrix calculated by fast-lmm?
Thanks Stefanie
- Reply to this email directly, view it on GitHub< https://github.com/fastlmm/FaST-LMM/issues/49>, or unsubscribe< https://github.com/notifications/unsubscribe-auth/ABR65P74IEBZETKMQQYVKCLZH2SCBAVCNFSM6AAAAABJNTBVKWVHI2DSMVQWIX3LMV43ASLTON2WKOZSGM2TMOBSGYYTGMI
. You are receiving this because you are subscribed to this thread.Message ID: @.**@.>>
— Reply to this email directly, view it on GitHub https://github.com/fastlmm/FaST-LMM/issues/49#issuecomment-2173713005, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABHTXXV5ANPUD3OL37W4NE3ZH356BAVCNFSM6AAAAABJNTBVKWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNZTG4YTGMBQGU . You are receiving this because you authored the thread.Message ID: @.***>
Here you go:
from fastlmm.util import example_file # Download and return local file name
from pysnptools.snpreader import Bed, Pheno
import pysnptools.util as pstutil
from pysnptools.kernelreader import SnpKernel
from pysnptools.standardizer import Unit
bed_fn = example_file('tests/datasets/synth/all.*','*.bed')
pheno_fn = example_file("tests/datasets/synth/pheno_10_causals.txt")
cov_fn = example_file("tests/datasets/synth/cov.txt")
bed = Bed(bed_fn, count_A1=True).read()
# print dimensions of bed
print(f"Bed dimensions: {bed.iid_count} x {bed.sid_count}")
pheno = Pheno(pheno_fn).read()
# replace original bed with one that has the same iids as the pheno
# If no values are missing, this will not do anything
bed, pheno = pstutil.intersect_apply([bed, pheno])
print(f"Bed dimensions: {bed.iid_count} x {bed.sid_count}")
# sample bed values
print(f"sample bed values: \n{bed.val[0:5,0:5]}")
# standardize the bed data
# that is: Make it mean 0 and sd 1 with any missing values filled in with 0
bed_standardized = bed.standardize()
print(f"sample bed values: \n{bed.val[0:5,0:5]}")
# create kinship matrix
kernel = SnpKernel(bed_standardized, standardizer=Unit())
kernel = kernel.read().standardize() # defaults to DiagKtoN standardize
print(f"sample kernel values: \n{kernel.val[0:5,0:5]}")
# Behind the scenes this is the same as:
# kernel2 = snps * snps.T / snps.shape[1]
kernel2 = bed_standardized.val @ bed_standardized.val.T / bed_standardized.sid_count
print(f"sample kernel2 values: \n{kernel2[0:5,0:5]}")
# plot the kinship matrix
import pylab
pylab.figure(figsize=(10,8))
pylab.suptitle("kinship matrix")
pylab.imshow(kernel.val, cmap=pylab.gray())
pylab.show()
As you can see, if you know if have no missing data, etc., it can be simplifed.
Thanks for using FaST-LMM and PySnpTools! -- Carl
Thanks a lot Carl! It worked very well with no missing values, but when I have missing values and I use: bed, pheno = pstutil.intersect_apply([bed, pheno])
I get this error: AttributeError: '_SnpSubset' object has no attribute 'val'
and further: bed_standardized = bed.standardize() AttributeError: '_SnpSubset' object has no attribute 'standardize'
Double check that your pheno line includes .read()
:
pheno = Pheno(pheno_fn).read()
Let me know if that's it.
Hi Carl,
yes, this line is there.
So the problem is, if there are no missing samples:
bed = Bed(bed_fn, count_A1=True).read()
pheno = Pheno(pheno_fn).read()
print (bed.shape, pheno.shape, type(bed), type(pheno))
bed, pheno = pstutil.intersect_apply([bed, pheno])
print (bed.shape, pheno.shape, type(bed), type(pheno))
Will return class 'pysnptools.snpreader.snpdata.SnpData'
58, 385649) (58, 1) <class 'pysnptools.snpreader.snpdata.SnpData'> <class 'pysnptools.snpreader.snpdata.SnpData'> (58, 385649) (58, 1) <class 'pysnptools.snpreader.snpdata.SnpData'> <class 'pysnptools.snpreader.snpdata.SnpData'>
If samples are missing:
(200, 949174) (196, 1) <class 'pysnptools.snpreader.snpdata.SnpData'> <class 'pysnptools.snpreader.snpdata.SnpData'> (196, 949174) (196, 1) <class 'pysnptools.snpreader._subset._SnpSubset'> <class 'pysnptools.snpreader._subset._SnpSubset'>
It's returns a pysnptools.snpreader._subset._SnpSubset object.
Which somehow is not compatible with:
bed_standardized = bed.standardize() AttributeError: '_SnpSubset' object has no attribute 'standardize'
I am on Windows 10, Python 3.9 with fast-lmm 0.6.7
Thanks
Excellent debugging! Ok, I think I've got it. Please add a line after the intersect:
bed, pheno = pstutil.intersect_apply([bed, pheno])
bed = bed.read()
Ah I see! Thanks a lot Carl, your support is super fantastic 👍
Everything works now. Best regards Stefanie
You're welcome. Always fun to show off PySnpTools
.
Hi,
I it possible to extract the kinship matrix calculated by fast-lmm?
Thanks Stefanie