gmmhe closed this issue 1 year ago
Hello,
The PGS scoring file for PGS000027 contains about 2.1 million variants across the entire genome. To calculate polygenic scores accurately, as described by the polygenic score authors, it's important that we only calculate scores using a number of variants similar to the number in the scoring file.
By default we prevent scores from being calculated unless at least 75% of the variants in the scoring file are present in the input target genomes (this threshold can be adjusted with --min_overlap, but adjusting it is normally a bad idea).
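As a rough illustration of the overlap threshold, here is a minimal sketch of the arithmetic. It is not how pgsc_calc implements the check. It assumes the threshold is a minimum matched fraction of 0.75, and the matched count is a hypothetical value chosen to represent a 10% match on a single chromosome.

```shell
# ~2.1 million variants comes from the scoring file described above;
# the matched count is hypothetical (10% of the scoring file).
total_variants=2100000
matched_variants=210000
min_overlap=0.75   # assumed default minimum matched fraction

# awk does the floating-point division and comparison.
result=$(awk -v m="$matched_variants" -v t="$total_variants" -v o="$min_overlap" \
  'BEGIN { rate = m / t; printf "%.2f %s", rate, (rate >= o ? "pass" : "fail") }')

echo "match rate and outcome: $result"
```

With these numbers the matched fraction is far below the threshold, so the check fails.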
There are a few technical reasons why a scoring file might match badly, like:
But a 10% match rate on 1 chromosome is quite good! I think if you try rerunning the workflow using all of your chromosomes the error should hopefully fix itself 😁 It's important to set up the split chromosomes in a single samplesheet (one row per chromosome).
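A samplesheet with one row per chromosome could be generated with a short loop like the sketch below. The column names (sampleset, path_prefix, chrom, format), the sampleset name "mycohort", and the /path/to/chrN_axy prefixes are assumptions for illustration. Please check the samplesheet documentation for the pgsc_calc version you are running, because the expected columns may differ.

```shell
# Assumed column layout; verify against the docs for your pgsc_calc version.
samplesheet=samplesheet_all_chroms.csv
echo "sampleset,path_prefix,chrom,format" > "$samplesheet"

# One row per autosome. "mycohort" and the chrN_axy prefixes are hypothetical
# and should point at the plink2 pgen/pvar/psam file sets you created.
for chrom in $(seq 1 22); do
  echo "mycohort,/path/to/chr${chrom}_axy,${chrom},pfile" >> "$samplesheet"
done

# Show the first few rows as a sanity check.
head -n 3 "$samplesheet"
```

Every row shares the same sampleset name, so the split chromosomes are treated as one dataset rather than as 22 separate ones.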
Cheers, Ben
Thank you so much, Ben! When I used all the chromosomes, it worked!
Hi,
I'm trying to calculate my first polygenic risk scores using the PGS Catalog, but I've run into an issue (I've copied the error below). I think the problem involves the preparation of my input genomes. I used plink2 v2.00a3.7 64-bit and set up the chromosomes using your example code, following this documentation (https://pgsc-calc.readthedocs.io/en/dev/how-to/prepare.html):
./plink2 --vcf chr21.merged.clean.noMono.vcf.gz \
  --allow-extra-chr \
  --chr 1-22, X, Y, XY \
  --make-pgen --out chr21_axy
I then run pgsc_calc with this command:
./nextflow run pgscatalog/pgsc_calc \
  -profile docker \
  --input samplesheet3.csv --target_build GRCh38 \
  --pgs_id PGS 000027 --target_build GRCh38
It seems that the problem is with the -chrom parameter, but I was following the steps. (I have not used all the chromosomes yet. I tried first with 1 chromosome and later with 3, but I don't think this is the problem.) So I cannot see where the issue is. Here is the error:
What can I do?
Thanks in advance!