kauwelab / PolyRiskScore

PRSKB is a website and command-line interface tool for calculating polygenic risk scores using GWA studies from the NHGRI-EBI Catalog.

GWAS summary statistics uploading #431

Closed by alisonomica 8 months ago

alisonomica commented 10 months ago

Hello PRSKB community, I am trying to adapt the summary statistics I obtained from a GWAS for use with the CLI tool, but I have some questions about the format. Is there a specific header that we have to use for the columns? For example, my file has a CHR header and the documentation says that the "Chromosome" column is required, so I'm not sure whether I have to rename that column. If there is an example file you can provide so I can check the structure, that would be awesome. I would also like to know whether I need to perform QC on both the VCF and the GWAS data before using the tool, or whether you already do it within the pipeline. Thank you in advance for your time.
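For anyone in the same situation, here is a minimal sketch of renaming header cells in a tab-separated summary statistics file. It assumes the tool matches columns by exact header name, which this thread does not confirm. Only "Chromosome" is taken from the discussion above; the `rename_header` function, the `HEADER_MAP` mapping, and the sample columns `POS` and `P` are hypothetical, so check the PRSKB documentation for the actual required names.

```python
# Hypothetical mapping: only "Chromosome" is named in this thread. Extend it
# with the exact column names listed in the PRSKB documentation.
HEADER_MAP = {"CHR": "Chromosome"}


def rename_header(tsv_text, header_map=HEADER_MAP):
    """Return tsv_text with header cells renamed per header_map.

    Only the first line is touched; data rows are passed through unchanged.
    """
    header, newline, body = tsv_text.partition("\n")
    # Preserve a Windows-style "\r" at the end of the header line, if present.
    ending = "\r" if header.endswith("\r") else ""
    cells = header.rstrip("\r").split("\t")
    renamed = [header_map.get(cell, cell) for cell in cells]
    return "\t".join(renamed) + ending + newline + body


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = "CHR\tPOS\tP\n1\t12345\t0.01\n"
    print(rename_header(sample), end="")
```

Headers that are not in the mapping are left as they are, so the function is safe to run on a file that is already partly converted.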

alisonomica commented 10 months ago

Hello again, I'm trying to run the CLI tool with my own GWAS summary statistics. I have already changed the format to be consistent with the documentation, but I'm receiving this error:

requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: http://myvariant.info/v1/query/?fields=dbsnp.alleles.allele%2C+dbsnp.dbsnp_merges%2C+dbsnp.gene.strand%2C+dbsnp.alt%2C+dbsnp.ref&q=dbsnp.rsid%3Ars778228
AN ERROR HAS CAUSED THE TOOL TO EXIT... Quitting

The command I'm running is: ./runPrsCLI.sh -f /root/PRSKB/vcfs/HG00152.30x.deepvariant.vcf.gz -o ./PRS_GWAS.json -c 0.0005 -r hg38 -p EUR -u ./GWASu.tsv

I think it's something related to the API request, but I'm not sure.

miller34 commented 10 months ago

Hello, thanks for using the PRSKB. Hopefully you are no longer experiencing the issue. There was unscheduled maintenance that took our servers down for a few days and might have caused an issue with API access. Everything should be working again. Thank you for your patience.

If the issue persists, you can run the PRSKB in two steps using the -s option. If there's an issue accessing the API, it will occur in step 1, since step 2 is run completely locally.
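As general background, an HTTP 429 status means the server is asking the client to slow down, and the usual generic remedy is to wait and try again with increasing pauses. The sketch below illustrates that retry-with-exponential-backoff pattern. It is not PRSKB code and not how the tool handles the error: `RateLimited` and `retry_with_backoff` are hypothetical names, and `call` stands in for whatever operation hits the API (for example, re-running step 1).

```python
import time


class RateLimited(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call() until it succeeds, doubling the wait after each RateLimited.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between
    attempts and re-raises the error once max_attempts have been used.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky():
        # Simulated operation that is rate limited on its first two calls.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RateLimited()
        return "ok"

    # base_delay is kept tiny here so the demo finishes quickly.
    print(retry_with_backoff(flaky, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.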

In terms of QC, we perform some QC on the GWAS/PRS calculations (e.g., linkage analyses to choose the SNPs that are included in calculations, ensuring that a user-specified threshold of variants is included in the PRS calculation, and filtering out GWAS that don't have sufficient SNPs), but most QC (e.g., base/variant quality scores) should be done before using the PRSKB. We also report PRS based on any GWAS submitted to the GWAS Catalog and do not attempt to assess the QC completed in studies submitted to the GWAS Catalog.