stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

Confirmation of the sample size value `n` #224

Open laleoarrow opened 2 months ago

laleoarrow commented 2 months ago

Hi all, I came across the handling of `n` in the source code of the coloc package's susie function, which has a section that sets the `n` (sample size) parameter like this:

    ## at 0.12.6 susieR introduced need for n = sample size
    if(!("n" %in% names(susie_args)))
      susie_args=c(list(n=d$N), susie_args)

If I'm understanding this correctly, setting `n` to the sample size of the GWAS summary data is only appropriate when an in-sample LD matrix is used. When the LD matrix is estimated from a reference panel, such as the 1kg panel, `n` should instead be the sample size of the reference panel. For instance, if the LD matrix is calculated from the individual-level data of the European subset of the 1kg reference panel, should `n` be approximately 500?

gaow commented 2 months ago

The sample size relates to the z-scores, i.e., it is the number of samples used to compute the association summary statistics. It is unrelated to the sample size of the LD reference panel.

pcarbo commented 2 months ago

Correct, N is the sample size for the association statistics (e.g., z-scores).
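To make the distinction concrete, here is a minimal base-R sketch of the default-filling logic quoted from coloc above. The helper name `fill_n` and the numbers `n_gwas` and `n_ref` are hypothetical, chosen only for illustration; the point is that `n` is taken from the GWAS sample size and the reference panel size is never used.

```r
# Hypothetical values, for illustration only.
n_gwas <- 50000  # samples used to compute the z-scores (assumed)
n_ref  <- 500    # samples in the LD reference panel (assumed); never passed as n

# d mimics a coloc-style dataset list whose N is the GWAS sample size.
d <- list(N = n_gwas)

# Hypothetical helper wrapping the logic quoted from coloc:
# if the caller did not supply n, fall back to the GWAS sample size d$N.
fill_n <- function(susie_args, d) {
  if (!("n" %in% names(susie_args)))
    susie_args <- c(list(n = d$N), susie_args)
  susie_args
}

args_default  <- fill_n(list(L = 10), d)            # n filled in from d$N
args_explicit <- fill_n(list(n = 1000, L = 10), d)  # caller's n is kept

# With susieR >= 0.12.6, the call would then look roughly like this,
# where z and R_ref are placeholders for your z-scores and LD matrix:
# fit <- susieR::susie_rss(z = z, R = R_ref, n = n_gwas)
```

Even when `R_ref` comes from a reference panel, the value passed as `n` stays the GWAS sample size.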

pcarbo commented 2 months ago

@leoarrow1 If the documentation is unclear in coloc, please post an issue on the coloc GitHub, and feel free to reference this discussion here.

laleoarrow commented 2 months ago

Thanks for the confirmation.