privefl / bigsnpr

R package for the analysis of massive SNP arrays.
https://privefl.github.io/bigsnpr/
196 stars 44 forks source link

I keep getting PGSs equal 0 #516

Closed trinetn closed 1 month ago

trinetn commented 1 month ago

Dear Florian

I have been using LDpred2 to estimate PG-scores for a while now and I have been very happy with the tool. I am currently having an issue with an LDpred2 analysis. I hope you might be able to help us. Any help is much appreciated!!

I want to estimate PGSs using weights estimated with LDpred2 using a reference sample, but for some reason I keep getting PGSs equal 0. My code can be seen below:

#---------------------------------------------------------------------------------------------------------------------------------------------------------------------

#First I load my libraries
library(bigsnpr)
library(dplyr)

#Next I load my weights that I have previously estimated with a reference genome and a GWAS sumstats – I also load a table with information about which SNPs where used to estimate weights and which alleles were a1 – I think you often call this table df_beta
load("weights.RData")
df_beta <- read.table(file=”df_beta_table.txt”, header=T)

#I now load my big SNP object – I only have 3 individuals in this object, since I am still running this as test – The object was created with snp_readBed() and I have restricted my object to hapmap3 + SNPs
obj.bigSNP <- snp_attach("Test_object.rds")

#I define the G matrix and the map variants:
G   <- obj.bigSNP$genotypes
map <- dplyr::transmute(obj.bigSNP$map,
chr = chromosome, pos = physical.pos,a0 = allele2, a1 = allele1,rsid=marker.ID)

#I obtain pos, a1, a0, rsid and chr from the df_beta and create a beta
map_pgs <- df_beta[1:5]
map_pgs$beta <- 1

#I identify variants that were used to estimate the weights and create a beta that reflect a potential allele flip
map_pgs2 <- snp_match(map_pgs, map,join_by_pos=F)

#I  now run the last step, where I want to estimate my PGSs, but for some reason I keep getting PGSs equal 0 for each of my individuals
final_pred <- big_prodVec(G, final_beta_auto * map_pgs2$beta,
                         ind.col = map_pgs2[["_NUM_ID_"]],
                         ncores = NCORES)

#----------------------------------------------------------------------------------------------

Can you think of any thing that I am doing wrong? I would be grateful for any suggestions

Best wishes Trine

privefl commented 1 month ago

So, final_pred is only 0s?

I would check two things:

trinetn commented 1 month ago

Yes, final_pred is only 0s

The length of final_beta_auto is around 1,1 mio SNPs and the vector contains only non-zero values

privefl commented 1 month ago

And G does not contain only 0s?

trinetn commented 1 month ago

No, it contains 0s,1s and 2s for all three individuals

privefl commented 1 month ago

Odd. Is this on GenomeDK, somewhere I can access? If you can save both final_beta_auto and map_pgs2 in some rds file, close to "Test_object.rds" somewhere I can access, I can have a look at it.

trinetn commented 1 month ago

Yes, I am working in the closed zone Brain on genomeDK, but I can´t share data with you - it is data from the ABCD cohort. I will try with a new target sample (a dataset from the Plink webpage) and see if I get the same results. I will write back when I have run the new analysis. Thank you for now

trinetn commented 1 month ago

Update: Even after running my code with a different dataset, I still get PGSs equal zero. The issue seems to stem from the nb_cores() function (actually not shown in the code above), the function returns 0 and depending on, if I run the nb_cores function or not I get PGSs equal zero or non-zero values

privefl commented 1 month ago

Thanks for this. Seems to be a bug on my end.

What do you get for

parallelly::availableCores(logical = FALSE)
parallelly::availableCores(logical = TRUE)
Sys.getenv("SLURM_JOB_CPUS_PER_NODE")
packageVersion("bigparallelr")
packageVersion("parallelly")
trinetn commented 1 month ago

I get: nproc 1 nproc 1 [1] "1" [1] "0.3.1" [1] "1.27.0"

privefl commented 1 month ago

What if you update to the latest CRAN version install.packages("bigparallelr") and try bigstatsr::nb_cores() again?

trinetn commented 1 month ago

Then I get: [1] 1

privefl commented 1 month ago

Perfect. It seems to be a bug I fixed 3y ago ahah. Glad that it now works for you.