Open montenegrina opened 4 years ago
Yes, you can simply skip this step, rename your files, and continue.
Yes, but I would like to QC for relatedness, so I was wondering how to do the whole step in PLINK because I don't have GCTA.
Ah OK, in that case you can first prune your data, then run --genome in PLINK, and then remove one randomly selected sample from each pair with PI_HAT > 0.125.
Can you please tell me if this is what you mean? And how would I randomly pick samples from the pairs with PI_HAT > 0.125?
plink --bfile outputZ --indep-pairwise 100 25 0.2
plink --bfile outputZ --extract plink.prune.in --make-bed --out outputZ1
plink --bfile outputZ1 --genome --max 0.125 --make-bed --out outputZ2
a=read.table("outputZ2.genome", header=T)
head(a)
     FID1 IID1    FID2 IID2 RT EZ     Z0 Z1     Z2 PI_HAT PHE      DST    PPC  RATIO
1 fam0110 G110 fam0113 G113 UN NA 0.9733  0 0.0267 0.0267  -1 0.807353 0.3533 1.9736
2 fam0110 G110 fam0114 G114 UN NA 1.0000  0 0.0000 0.0000  -1 0.807687 0.1310 1.9226
3 fam0110 G110 fam0114 G115 UN NA 1.0000  0 0.0000 0.0000  -1 0.808148 0.0327 1.8749
4 fam0110 G110 fam0114 G116 UN NA 0.9787  0 0.0213 0.0213  -1 0.806944 0.1706 1.9344
5 fam0110 G110 fam0117 G117 UN NA 1.0000  0 0.0000 0.0000  -1 0.808925 0.1715 1.9348
6 fam0110 G110 fam0118 G118 UN NA 0.9958  0 0.0042 0.0042  -1 0.804596 0.7876 2.0573
Is there any difference between doing the above and this?
plink2 --bfile outputZ10 --king-cutoff 0.088
plink2 --bfile outputZ10 --remove plink2.king.cutoff.out.id --make-bed --out outputZ11
As I understand it, this would remove first- and second-degree relatives, but I don't understand how it relates to the PI_HAT value. Can you please explain the relationship between --king-cutoff 0.088 and PI_HAT?
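I would update it to this:
plink --bfile outputZ1 --genome --min 0.125 --make-bed --out outputZ2
It's better to use --min than --max here, so that the .genome file lists the pairs with PI_HAT of at least 0.125 (the related pairs) rather than the unrelated ones. The PI_HAT column can be interpreted as 0.5 = first degree, 0.25 = second degree, etc.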
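For the "pick one sample per related pair" part, here is a minimal sketch using awk on the .genome output. The function name `pick_related` and the seed argument are made up for illustration, and the greedy choice below is just one simple way to do it, not necessarily what the pipeline does. It walks through the pairs above the threshold, skips a pair if one of its members is already flagged, and otherwise flags one member at random.

```shell
#!/bin/sh
# Sketch only: build a PLINK --remove list from a .genome file.
# Usage (hypothetical helper): pick_related GENOME_FILE [THRESHOLD] [SEED]
pick_related() {
    awk -v thr="${2:-0.125}" -v seed="${3:-1}" '
        BEGIN { srand(seed) }
        NR == 1 {
            # locate the PI_HAT column from the header line
            for (i = 1; i <= NF; i++) if ($i == "PI_HAT") col = i
            next
        }
        col && $col + 0 > thr {
            a = $1 " " $2          # FID1 IID1
            b = $3 " " $4          # FID2 IID2
            if ((a in drop) || (b in drop)) next   # pair already broken up
            pick = (rand() < 0.5) ? a : b          # flag one member at random
            drop[pick] = 1
            print pick                             # FID IID, as --remove expects
        }
    ' "$1"
}
```

Something like `pick_related outputZ2.genome 0.125 > related.remove` followed by `plink --bfile outputZ1 --remove related.remove --make-bed --out outputZ3` should then drop the flagged samples (the output names here are placeholders). Note that this greedy pass does not try to minimise the number of samples removed.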
Using --king-cutoff should be more or less equivalent to the GCTA or PI_HAT approach.
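On how the two scales relate: in expectation the kinship coefficient is half of PI_HAT, since PI_HAT = Z1/2 + Z2 and kinship = Z1/4 + Z2/2. So duplicates are around 0.5 on the kinship scale, first degree around 0.25, second degree around 0.125. A rough conversion sketch follows (the helper names are made up, and the two estimators are computed differently, so real values will only match approximately):

```shell
#!/bin/sh
# Sketch only: expected relationship kinship = PI_HAT / 2.
pihat_to_kinship() { awk -v p="$1" 'BEGIN { printf "%.4f\n", p / 2 }'; }
kinship_to_pihat() { awk -v k="$1" 'BEGIN { printf "%.4f\n", k * 2 }'; }

pihat_to_kinship 0.125   # PI_HAT cutoff from this thread, on the kinship scale
kinship_to_pihat 0.088   # --king-cutoff 0.088, on the PI_HAT scale
```

By this conversion, --king-cutoff 0.088 corresponds to a PI_HAT of roughly 0.176, which sits between the expected second-degree (0.25) and third-degree (0.125) values, while PI_HAT > 0.125 corresponds to a kinship of about 0.0625 and is therefore somewhat stricter.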
Thanks! Also, one more thing: I don't see any QC steps after imputation here. Do you just recommend removing SNPs with a low imputation score?
You can follow something like this => https://github.com/neurogenetics/GWAS-pipeline#regions-file
Thanks! Are these parameters:
MAF >= 0.001 & Rsq >= 0.30
usually recommended for data imputed with Minimac4?
I set Rsq >= 0.30 in my imputation parameters; do I need to apply it again to the imputed data?
Also, I was thinking of running these steps on my imputed data. Can you please let me know what you think?
plink --vcf chr1.dose.vcf.gz --biallelic-only --make-bed --double-id --out s1
plink --bfile s1 --bmerge s1 --merge-mode 6
plink --bfile s1 --exclude plink.missnp --make-bed --out s2
plink --bfile s2 --list-duplicate-vars
plink --bfile s2 --exclude plink.dupvar --make-bed --out s3
plink --bfile s3 --qual-scores chr1.info 7 1 1 --qual-threshold 0.8 --make-bed --out s4
plink --bfile s4 --maf 0.01 --hwe 1e-7 --snps-only --make-bed --out s5
plink --bfile s5 --geno 0.1 --mind 0.05 --make-bed --out FINAL_DATA_QC1
Thanks
Ana
Ah, if it's already pre-filtered then you should be good to go. I don't think you need to filter for duplicates again, because variant names should now be in the format CHR:BP:REF:ALT. If you want, you can filter for R2 0.8, but I would recommend doing the GWAS on dosages rather than on PLINK's rounded calls.
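If you do want to apply the MAF and Rsq thresholds to the imputed data yourself, a minimal sketch could look like the following. It assumes a Minimac-style .info file whose header contains columns named `MAF` and `Rsq` and whose first column is the variant ID; check your own file first, since the layout depends on the imputation software version. The function name `filter_info` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: list variant IDs passing MAF and Rsq thresholds.
# Usage (hypothetical helper): filter_info INFO_FILE [MIN_MAF] [MIN_RSQ]
filter_info() {
    awk -v maf="${2:-0.001}" -v rsq="${3:-0.30}" '
        NR == 1 {
            # find the MAF and Rsq columns by name instead of by position
            for (i = 1; i <= NF; i++) {
                if ($i == "MAF") m = i
                if ($i == "Rsq") r = i
            }
            next
        }
        # keep the variant ID (first column) when both thresholds are met
        m && r && $m + 0 >= maf && $r + 0 >= rsq { print $1 }
    ' "$1"
}
```

For example, `filter_info chr1.info 0.001 0.8 > keep.snps` and then `plink --bfile s3 --extract keep.snps --make-bed --out s4` would keep only the passing variants, provided the IDs in the info file match those in the .bim file.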
Well, I did QC my data prior to imputation (all the steps you mention on your page), but my question is: POST imputation, do I need to do any QC steps (is there any evidence for that) aside from setting R2 0.8? On your page you mention MAF >= 0.001 & Rsq >= 0.30. Why this MAF value?
A MAF filter is usually a good idea prior to GWAS: depending on your sample size, you likely won't have power for anything under MAF 1-5%, so including all the variants with lower MAF just makes the analysis slower.
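As a rough intuition for the power argument, you can look at how many copies of the minor allele you would expect in your sample, which is 2 x N x MAF. This is only a back-of-the-envelope sketch (the helper name is made up, and real power also depends on effect size and phenotype):

```shell
#!/bin/sh
# Sketch only: expected number of minor-allele copies among N diploid samples.
# Usage (hypothetical helper): expected_minor_alleles N_SAMPLES MAF
expected_minor_alleles() {
    awk -v n="$1" -v maf="$2" 'BEGIN { printf "%.0f\n", 2 * n * maf }'
}

expected_minor_alleles 1000 0.001   # very few carriers to test
expected_minor_alleles 1000 0.05    # far more observations of the allele
```

With 1000 samples, MAF 0.001 gives about 2 expected copies and MAF 0.05 gives about 100, which is why rare variants contribute little in small cohorts.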
Thanks for that. So, in conclusion, if I did a very detailed QC prior to imputation, the only step I need to do after imputation is to remove SNPs with imputation scores below, say, 0.8?
You could run another HWE filter just to be on the safe side. If you are looking for a more automated workflow, you can check this one => https://github.com/GP2code/GWAS
Thanks! I am not looking so much for automated workflow but more for general recommendations on how to proceed with GWAS, like what are the latest "trends"
Hello,
Is there a way to do this whole step of removing related individuals in PLINK, and what would it look like?
gcta --bfile $FILENAME --make-grm --out GRM_matrix --autosome --maf 0.05
gcta --grm-cutoff 0.125 --grm GRM_matrix --out GRM_matrix_0125 --make-grm
plink --bfile $FILENAME --keep GRM_matrix_0125.grm.id --make-bed --out ${FILENAME}_relatedness
Thanks Ana