I used the following command to calculate a PGS for my UK Biobank data with the PGS003765 scoring file. The pipeline completed successfully, but only 12.9% of the variants matched. Is this normal? From what I have read, a low match rate can be caused by the wrong genome build or by genotype data that has not been imputed. On the first point, I checked the official UK Biobank website and confirmed that GRCh37 is the genome build used. On the second point, I am not sure whether the UK Biobank data I am using has been imputed, and I am not familiar with this step. Could you help me identify the cause of the low variant match rate?
Thank you!
(base) ubuntu@VM-16-6-ubuntu:~$ nextflow run pgscatalog/pgsc_calc \
-profile singularity \
--input samplesheet.csv --target_build GRCh37 \
--pgs_id PGS003765 \
--run_ancestry /home/ubuntu/pgsc_HGDP+1kGP_v1.tar.zst
N E X T F L O W ~ version 23.10.1
Launching `https://github.com/pgscatalog/pgsc_calc` [cheeky_bell] DSL2 - revision: 8bdf287d55 [main]
WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`
------------------------------------------------------
pgscatalog/pgsc_calc v2.0.0-alpha.5-g8bdf287
------------------------------------------------------
Core Nextflow options
revision : main
runName : cheeky_bell
containerEngine : singularity
launchDir : /home/ubuntu
workDir : /home/ubuntu/work
projectDir : /home/ubuntu/.nextflow/assets/pgscatalog/pgsc_calc
userName : ubuntu
profile : singularity
configFiles :
Input/output options
input : samplesheet.csv
pgs_id : PGS003765
outdir : results
Reference options
run_ancestry : /home/ubuntu/pgsc_HGDP+1kGP_v1.tar.zst
ref_samplesheet : /home/ubuntu/.nextflow/assets/pgscatalog/pgsc_calc/assets/ancestry/reference.csv
ld_grch37 : /home/ubuntu/.nextflow/assets/pgscatalog/pgsc_calc/assets/ancestry/high-LD-regions-hg19-GRCh37.txt
ld_grch38 : /home/ubuntu/.nextflow/assets/pgscatalog/pgsc_calc/assets/ancestry/high-LD-regions-hg38-GRCh38.txt
ancestry_checksums: /home/ubuntu/.nextflow/assets/pgscatalog/pgsc_calc/assets/ancestry/checksums.txt
Compatibility options
target_build : GRCh37
Matching options
min_overlap : 0
executor > local (58)
[9c/70786c] process > PGSCATALOG_PGSCCALC:PGSCCALC:DOWNLOAD_SCOREFILES ([pgs_id:PGS003765, pgp_id:, trait_efo:]) [100%] 1 of 1 ✔
[d9/08eea4] process > PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1) [100%] 1 of 1 ✔
[- ] process > PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_RELABELBIM -
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_RELABELPVAR (ukb chromosome 9) [100%] 24 of 24, stored: 24 ✔
[- ] process > PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_VCF -
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:EXTRACT_DATABASE (1) [100%] 1 of 1, stored: 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (ukb chromosome 1) [100%] 24 of 24, stored: 24 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:FILTER_VARIANTS (ukb GRCh37) [100%] 1 of 1, stored: 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:PLINK2_MAKEBED_REF (reference) [100%] 1 of 1, stored: 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_THINNED (ukb) [100%] 1 of 1, stored: 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:RELABEL_IDS (ukb null pvar) [100%] 1 of 1, stored: 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:PLINK2_MAKEBED_TARGET (ukb) [100%] 1 of 1, stored: 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:PLINK2_ORIENT (ukb) [100%] 1 of 1, stored: 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:FRAPOSA_PCA (reference) [100%] 1 of 1, stored: 1 ✔
[5b/4a8e3a] process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:FRAPOSA_PROJECT (ukb) [100%] 10 of 10, stored: 6 ✔
[6f/a71228] process > PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_VARIANTS (ukb chromosome 8) [100%] 24 of 24 ✔
[5d/219be1] process > PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_COMBINE (ukb) [100%] 1 of 1 ✔
[a3/15c1b7] process > PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:RELABEL_SCOREFILES (ukb additive scorefile) [100%] 1 of 1 ✔
[skipped ] process > PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:RELABEL_AFREQ (ukb null afreq) [100%] 1 of 1, stored: 1 ✔
[47/e17321] process > PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:PLINK2_SCORE (reference chromosome ALL effect type additive 0) [100%] 22 of 22 ✔
[4c/74ab17] process > PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:SCORE_AGGREGATE (ukb) [100%] 1 of 1 ✔
[11/afd94a] process > PGSCATALOG_PGSCCALC:PGSCCALC:REPORT:ANCESTRY_ANALYSIS (1) [100%] 1 of 1 ✔
[d8/1a6e55] process > PGSCATALOG_PGSCCALC:PGSCCALC:REPORT:SCORE_REPORT (ukb) [100%] 1 of 1 ✔
[a0/421899] process > PGSCATALOG_PGSCCALC:PGSCCALC:DUMPSOFTWAREVERSIONS (1) [100%] 1 of 1 ✔
-[pgscatalog/pgsc_calc] Pipeline completed successfully-
Completed at: 13-Apr-2024 23:49:20
Duration : 1h 16m 4s
CPU hours : 5.0
Succeeded : 58
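To make clear what I mean by "match rate", here is a minimal sketch of how I understand the figure: the fraction of scoring-file variants whose chromosome and position also appear in the target genotypes. This is not the pipeline's own code. The function names `normalise_chrom` and `match_rate` and the toy coordinates are hypothetical, and the sketch ignores alleles, which real matching would also need to compare.

```python
def normalise_chrom(chrom):
    """Strip an optional 'chr' prefix so that '1' and 'chr1' compare equal."""
    chrom = str(chrom)
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def match_rate(score_variants, target_variants):
    """Return the fraction of scoring-file variants found in the target data.

    Both arguments are iterables of (chromosome, position) pairs.
    Simplification: alleles are not checked here.
    """
    score = {(normalise_chrom(c), int(p)) for c, p in score_variants}
    target = {(normalise_chrom(c), int(p)) for c, p in target_variants}
    if not score:
        return 0.0
    return len(score & target) / len(score)


# Toy data (made up for illustration): four scoring-file variants,
# only one of which is present in the target variant list.
score = [("1", 1000), ("1", 2000), ("2", 3000), ("2", 4000)]
target = [("chr1", 1000), ("chr2", 9999)]

print(f"{match_rate(score, target):.1%}")  # prints 25.0%
```

With positions from the wrong build, or with only directly genotyped variants in the target set, a calculation like this would give a small fraction, which is why I suspect one of those two causes.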