Open xflicsu opened 1 year ago
If you are specifically referring to the file 'GWAS_summ_example.txt', I obtained the GWAS summary data for PBC from UKBiobank. To optimize runtime for testing, I randomly selected 10,000 SNPs from the entire pool of SNPs. It is worth noting that the actual GWAS dataset comprises millions of SNPs. The databases you mentioned serve as reliable sources for accessing GWAS data, and FinnGen is also renowned in this regard.
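The subsampling described above could be sketched roughly as follows. This is only an illustration, not the script actually used to build the example file: the function name `subsample_snps`, the fixed seed, and the assumption that the file has one header line followed by one SNP per line are all hypothetical.

```python
import random


def subsample_snps(lines, n, seed=0):
    """Keep the header line plus n randomly chosen SNP lines, in file order.

    `lines` is a list of text lines whose first element is the header.
    """
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    if n >= len(rows):
        # Nothing to drop: the file already has n SNPs or fewer.
        return list(lines)
    rng = random.Random(seed)  # fixed seed so the test subset is reproducible
    keep = sorted(rng.sample(range(len(rows)), n))
    return [header] + [rows[i] for i in keep]
```

For a test file like the one described, `n` would be 10,000; a full analysis should use the complete summary statistics instead.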
Thanks for the quick response. Most public databases only provide the significant SNPs related to one phenotype. To find the cell types related to a phenotype, must we provide all the SNPs from one GWAS study? BTW, how can I collect that information? Thanks again!
To run scPagwas, it is best to provide a phenotype with all of its SNPs. If all the SNPs cannot be obtained, consider whether the number of SNPs you do have is sufficient. We have not specifically evaluated how the number of missing SNPs affects the results (which is indeed a meaningful direction for evaluation). In our calculations, the smallest datasets we used have at least 800,000 SNPs. The more SNPs there are, the more accurate the regression results will be. You can find and directly download the desired phenotype from these public databases: https://gwas.mrcieu.ac.uk/ You can also download data such as GCST90041886_buildGRCh37.tsv.gz from here: http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/
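To check how many SNPs a downloaded file actually contains, something like the sketch below may help. It assumes the file is gzipped, tab-separated text with a single header line and one SNP per row; the helper name `count_snps` is made up for illustration.

```python
import gzip


def count_snps(path):
    """Count data rows (one SNP per row) in a gzipped summary-statistics file."""
    with gzip.open(path, "rt") as handle:
        next(handle, None)  # skip the header line
        # Ignore blank lines so a trailing newline does not inflate the count.
        return sum(1 for line in handle if line.strip())
```

Comparing the result with the roughly 800,000 SNPs mentioned above gives a quick sense of whether a file holds full summary statistics or only the significant hits.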
Thanks for your useful and detailed suggestions. We may have other questions about running scPagwas. So, can we communicate by email (my gmail: xfliwz@gmail.com) or WeChat?
You can communicate with me by email, in Chinese or any language you want.
Thanks! GitHub may hide the real email address, so I cannot see your ID. My email: xfliwz at gmail dot com.
@dengchunyu Hello, the buildGRCh37.tsv.gz file I downloaded does not contain an rsid column. A sample of the file contents:
chromosome  base_pair_location  effect_allele  other_allele  effect_allele_frequency  beta  standard_error  p_value  variant_id
1  751343  A  T  0.147797863296056  -0.0371990816265321  0.0246716881017141  0.131614954629239  NA
How should I process it to get the rsid column required for the GWAS_summ_example.txt file? Is there a method you would recommend? Thank you 🙏
Also, can effect_allele_frequency be used as the maf column?
Data link: http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90018001-GCST90019000/GCST90018629/
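The answer to the first question can be found here: https://www.biostars.org/p/72066/ As for the second question, effect_allele_frequency is not the same as maf. EAFs range from 0 to 1 and are tied to the phenotype (the "effect", i.e. the frequency of the effect allele in the specific dataset). MAFs, by contrast, represent the frequency of the minor allele, so they range from 0 to 0.5. Subtracting every EAF value greater than 0.5 from 1 turns it into a MAF.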
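The EAF-to-MAF rule above can be written as a small function. This is a minimal sketch that assumes biallelic SNPs, so the two allele frequencies sum to 1; the name `eaf_to_maf` is just for illustration.

```python
def eaf_to_maf(eaf):
    """Convert an effect allele frequency (0..1) to a minor allele frequency (0..0.5)."""
    if not 0.0 <= eaf <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    # Values above 0.5 mean the effect allele is the major allele,
    # so the minor allele frequency is 1 - EAF.
    return 1.0 - eaf if eaf > 0.5 else eaf
```

For the sample row quoted earlier, the EAF of 0.147797863296056 is already below 0.5, so it can be used as the MAF unchanged.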
@dengchunyu Thank you so much!
I wonder where we can download a GWAS file from a public database with the information in "GWAS_summ_example.txt"?
As we know, there are several public GWAS databases, such as GWAS Catalog, IEU OpenGWAS project, GWAS atlas, GWAS Atlas - NGDC/CNCB, etc.