neurogenomics / MAGMA_Celltyping

Find causal cell-types underlying complex trait genetics
https://neurogenomics.github.io/MAGMA_Celltyping

Try to reduce the size of example sumstats files to 30mb #46

Closed NathanSkene closed 2 years ago

NathanSkene commented 3 years ago

Could try filtering by frequency?

Or reducing to HapMap3 SNPs... I think that's what makes LDSC-munged sumstats files so small.

I'd try both and check whether the results look similar.

bschilder commented 2 years ago

Reduced from 11 million to 400k SNPs. The file is much faster to download and still produces results with MAGMA. I kept SNPs with MAF >= 5% and a nominal p-value < .05.

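# Get the path to the munged example GWAS summary statistics
path_formatted <- MAGMA.Celltyping::get_example_gwas(
  trait = "fluid_intelligence",
  munged = TRUE)

# Keep common SNPs (MAF >= 5%) with a nominal p-value < .05
ss <- data.table::fread(path_formatted)
ss2 <- ss[MINOR_AF >= .05 & P < .05, ]

bschilder commented 2 years ago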

I've now filtered both examples ("fluid_intelligence" and "prospective_memory"), and they can be accessed via the MAGMA.Celltyping::get_example_gwas function.

They're both now <40 MB (close enough!)
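
For anyone who wants to see how much a filter like this shrinks a file before applying it to a real GWAS, here is a rough sketch on simulated data. It uses base R only. The `MINOR_AF` and `P` column names follow the snippet above; the `SNP` column, the uniform toy distributions, and the `write_gz` helper are assumptions made up for illustration, so the ratio it prints says nothing about the real example files.

```r
# Toy summary statistics. MINOR_AF and P follow the thread;
# SNP is a placeholder column added for illustration only.
set.seed(1)
n <- 10000
ss <- data.frame(
  SNP = paste0("rs", seq_len(n)),
  MINOR_AF = runif(n, min = 0, max = 0.5),
  P = runif(n)
)

# Same filter as in the thread: MAF >= 5% and nominal p-value < .05.
ss2 <- ss[ss$MINOR_AF >= 0.05 & ss$P < 0.05, ]

# Hypothetical helper: write a table as gzipped TSV to a temp file
# and return the path.
write_gz <- function(df) {
  path <- tempfile(fileext = ".tsv.gz")
  con <- gzfile(path, open = "wt")
  write.table(df, con, sep = "\t", quote = FALSE, row.names = FALSE)
  close(con)
  path
}

# Compare on-disk sizes (in bytes) before and after filtering.
size_full <- file.size(write_gz(ss))
size_filtered <- file.size(write_gz(ss2))

cat("rows kept:", nrow(ss2), "of", nrow(ss), "\n")
cat("size ratio:", round(size_filtered / size_full, 3), "\n")
```

On real sumstats the reduction depends on the actual MAF and p-value distributions, so it is worth re-running MAGMA on the filtered file to confirm the results still look similar.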