Closed lisaeick closed 5 months ago
Hi lisaeick,
Thank you for your interest in POP-GWAS and for trying it out!
There are two scenarios where the current POP-GWAS may produce noisy results:
Can you run the following sensitivity analysis to verify this?
It would be great if you could also share the summary statistics with us via e-mail, so we could reproduce your results and help in investigating the issue further. However, we understand if sharing this data is not feasible.
Best, Jiacheng
Hey Jiacheng,
Thanks for your fast reply, and sorry for taking so long to answer; with our download limits and Easter, it took me some time to get the sumstats.
If you know a way to share them, let me know; they are big (2.2 GB).
I prepared the data exclusively for POP-GWAS and split them randomly into labeled and unlabeled sets, so there is no selection bias. Of course there is some relatedness within the samples, since it's one cohort, but we made sure to avoid sample overlap.
I downloaded the raw sumstats and attached the rsIDs, to rule out an error in one of my preprocessing steps. As I said, please let me know how you would like to receive them; I am happy to share the sumstats.
Best greetings, Lisa
From: Jiacheng Miao @.> Sent: 26 March 2024 16:03 To: qlu-lab/POP-TOOLS @.> Cc: Eick, Lisa @.>; Author @.> Subject: Re: [qlu-lab/POP-TOOLS] Pop-Gwas not working (Issue #1)
Hi Lisa,
Thank you for your email and for preparing the summary statistics.
To share the large dataset, we can use a cloud storage service such as Google Drive, Dropbox, OneDrive, or Box. You can upload the data to a shared folder and send me a link to it directly by e-mail (not through the GitHub issue). After that, I will look into POP-GWAS.
Best, Jiacheng
Hi Jiacheng,
Can you please provide an e-mail address, other than this GitHub thread, so I can share the OneDrive link?
Best Lisa
My e-mail is jiacheng.miao@wisc.edu
Best, Jiacheng
Hey Jiacheng, I sent you a share message via OneDrive. Please let me know if any problems occur. Thanks for your fast and friendly replies. Best greetings, Lisa
Hi Lisa,
Thank you for your patience. I have made two updates to POP-GWAS to resolve this issue:
For 1, the original GWAS summary statistics contained many SNPs with duplicate IDs, which would have corrupted the calculations.
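A quick way to check for duplicate SNP IDs before running POP-GWAS is sketched below. It is only an illustration, not the fix that was applied inside POP-GWAS. It assumes a tab-delimited summary statistics file with a header row and an ID column named SNP; that column name and the helper name find_duplicate_ids are assumptions and may not match your files.

```python
import csv
import io
from collections import Counter


def find_duplicate_ids(handle, id_column="SNP"):
    """Return {snp_id: count} for IDs that appear more than once.

    Assumes a tab-delimited file with a header row. `id_column` is a
    hypothetical column name; adjust it to your summary statistics.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter(row[id_column] for row in reader)
    return {snp: n for snp, n in counts.items() if n > 1}


# Tiny in-memory example standing in for a real summary statistics file.
example = io.StringIO(
    "SNP\tA1\tA2\tBETA\tSE\n"
    "rs1\tA\tG\t0.01\t0.02\n"
    "rs2\tC\tT\t-0.03\t0.02\n"
    "rs1\tA\tC\t0.02\t0.02\n"
)
duplicates = find_duplicate_ids(example)
print(duplicates)
```

For a real file, pass an open file handle instead of the in-memory example. How to resolve the duplicates (for instance, multi-allelic sites that share one rsID) depends on the data and is not covered here.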
For 2, although there are no overlapping individuals in the labeled and unlabeled data, the non-zero intercept (0.15) of the bivariate LDSC between the input GWAS indicates a residual correlation, so the GWAS performed in these two samples cannot be considered truly independent. I have updated POP-GWAS to address this issue.
All you need to do is add --sample-overlap to the POP-GWAS script. You may also need to update your POP-GWAS dependencies to the latest version.
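For context, the standard cross-trait LD score regression model shows why a non-zero intercept signals dependence between two GWAS. This is general LDSC background, not a description of how POP-GWAS implements --sample-overlap. The symbols are the usual LDSC notation and are not defined anywhere in this thread: z_{1j} and z_{2j} are the z-scores of SNP j in the two GWAS, N_1 and N_2 the sample sizes, M the number of SNPs, \ell_j the LD score, \rho_g the genetic covariance, N_s the number of shared samples, and \rho the phenotypic correlation among them.

```latex
\mathbb{E}\left[ z_{1j} z_{2j} \right]
  = \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j
  + \frac{\rho\,N_s}{\sqrt{N_1 N_2}}
```

With fully independent samples (N_s = 0) the second term, the intercept, is expected to be zero. An estimated intercept of 0.15 therefore suggests correlated errors between the two GWAS even without shared individuals. Relatedness within a single cohort is one plausible source, although the thread does not confirm the cause.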
A Manhattan plot (without MAF cutoff) using the updated POP-GWAS is attached:
If the MAF > 0.01 cutoff is applied, the Manhattan plot is
I have also emailed you the scripts to reproduce my results. Thank you for identifying the issues.
Best, Jiacheng
Dear POP-GWAS team, I tried your method in FinnGen for a binary trait (CHD): we predict the trait and run a continuous GWAS on the probability values that the model outputs. The POP-GWAS output is very inflated and very noisy. We exchanged the LDSC reference file for a FinnGen-specific one; the inflation improved, but the noisiness remains. Any ideas what went wrong?