Closed by HerefordGuy 5 months ago
No, that's really the total number of variants considered.
The total number in the GWAS, correct? Thanks for clarifying!
No, `ld_size` is the number of variants that were used to compute LD scores. Isn't that what it says in the documentation?
The number of variants used to compute an LD score varies from SNP to SNP. For example, in LD scores computed with GCTA, the number of SNPs used to calculate an LD score varied from 5 to 2805. That is why GCTA has a column where they report "snp_num" (see "GCTA-LDS: calculating LD score for each SNP"). bigsnpr doesn't seem to be able to handle a variable number of variants used to calculate LD scores.
It could easily, but I think it is not supposed to be used like that; where did you see that?
If you set a genetic distance or physical distance threshold, then the number of SNPs used to calculate the LD score is going to vary. If you set a fixed number of SNPs (leave `infos.pos = NULL`), then the size of the window used to calculate the LD score is going to vary greatly from SNP to SNP.
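To make the trade-off concrete, here is a minimal sketch in plain Python. It is not bigsnpr's or GCTA's implementation, the positions are made up, and the helper names `neighbors_in_window` and `span_of_fixed_window` are hypothetical. It only illustrates that a distance-based window gives a per-SNP neighbor count, while a fixed-count window gives a per-SNP span.

```python
from bisect import bisect_left, bisect_right

def neighbors_in_window(pos, window):
    """Count SNPs (the focal SNP included) within +/- window of each SNP.

    `pos` must be sorted; units are arbitrary (kb, cM, ...).
    """
    counts = []
    for p in pos:
        lo = bisect_left(pos, p - window)   # first index with pos >= p - window
        hi = bisect_right(pos, p + window)  # one past last index with pos <= p + window
        counts.append(hi - lo)
    return counts

def span_of_fixed_window(pos, k):
    """Distance covered when each SNP uses up to k neighbors on each side."""
    n = len(pos)
    return [pos[min(i + k, n - 1)] - pos[max(i - k, 0)] for i in range(n)]

# Hypothetical positions (kb): a small cluster, an isolated SNP, a dense cluster.
pos_kb = [10, 12, 13, 15, 40, 95, 96, 97, 98, 99]

counts = neighbors_in_window(pos_kb, window=5)
spans = span_of_fixed_window(pos_kb, k=2)

print(counts)  # number of SNPs per window differs between SNPs
print(spans)   # physical span per window differs between SNPs
```

With these toy positions, the isolated SNP at 40 kb has only itself in its 5-kb window, while SNPs in the dense cluster have five, so any per-SNP count of variants used is naturally a vector rather than a single number.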
Bulik-Sullivan et al. 2015 used a 1-cM window. https://doi.org/10.1038/ng.3211
I am probably misunderstanding something. My apologies.
In equation (1), my understanding was that M is the total number of variants for which you computed the LD scores, not the number used to compute the LD scores in each window.
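For reference, the LD score regression equation from Bulik-Sullivan et al. 2015 is usually written as below. The notation follows the paper; how $M$ maps onto `ld_size` in bigsnpr is exactly the point under discussion here, so treat that mapping as an assumption.

```latex
\mathbb{E}\left[\chi_j^2 \mid \ell_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1
```

Here $N$ is the sample size, $h^2$ the SNP heritability, $\ell_j$ the LD score of variant $j$, $a$ the contribution of confounding biases, and $M$ the number of SNPs, so that $h^2 / M$ is the average heritability per SNP. Note that $M$ carries no index $j$: in this form it is a single number, not a per-variant quantity.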
Any update on this?
On line 76 of `ldsc.R`, the code requires that `ld_size` be a single number. This is only true if the number of neighboring SNPs is set. But if `size` in `snp_ld_scores()` is a physical or genetic distance, the number of SNPs used to calculate LD scores will vary for each SNP. Could `ld_size` be set to be a vector of the same length as `ld_score`?