perslab / CELLECT

CELLECT (CELL-type Expression-specific integration for Complex Traits)
GNU General Public License v3.0

Add option to exclude MHC region for CELLECT-MAGMA #52

Open pascaltimshel opened 4 years ago

pascaltimshel commented 4 years ago

The Problem

CELLECT-LDSC does not include HLA signal, but CELLECT-MAGMA does.

The Solution

In prioritize_annotations_snake.py, remove all genes in df_magma located within chr6:25Mb-34Mb before running the regression.
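A minimal sketch of what such a filter could look like, assuming df_magma is a pandas DataFrame with chromosome and gene-coordinate columns. The column names `CHR`, `START` and `STOP` follow MAGMA's gene output layout and are an assumption about df_magma, and the helper name `exclude_mhc_genes` is hypothetical. The sketch drops any gene that overlaps the region, which is slightly stricter than "located within"; the genome build for the coordinates is not stated in this issue.

```python
import pandas as pd

# MHC bounds as given in this issue (chr6:25Mb-34Mb).
MHC_CHR = "6"
MHC_START = 25_000_000
MHC_END = 34_000_000


def exclude_mhc_genes(df_magma, chr_col="CHR", start_col="START", stop_col="STOP"):
    """Return a copy of df_magma without genes overlapping the MHC region.

    The column names are assumptions; adjust them to whatever
    df_magma actually contains.
    """
    # Normalise chromosome labels so 6, "6" and "chr6" all match.
    chrom = df_magma[chr_col].astype(str).str.replace("chr", "", regex=False)
    # A gene overlaps the region if it starts before the region ends
    # and ends after the region starts.
    in_mhc = (
        (chrom == MHC_CHR)
        & (df_magma[start_col] <= MHC_END)
        & (df_magma[stop_col] >= MHC_START)
    )
    return df_magma.loc[~in_mhc].copy()


# Toy data for illustration only (not real gene coordinates).
example = pd.DataFrame(
    {
        "GENE": ["A", "B", "C", "D"],
        "CHR": [6, 6, 1, "chr6"],
        "START": [26_000_000, 40_000_000, 26_000_000, 24_990_000],
        "STOP": [26_100_000, 40_100_000, 26_100_000, 25_010_000],
    }
)
filtered = exclude_mhc_genes(example)
```

Filtering on overlap rather than full containment means genes straddling either boundary are also removed; whether that is the desired behaviour is a design choice left open here.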

Add EXCLUDE_MHC to config.yml:

MAGMA_CONST:
  DATA_DIR: # Path to the data used for CELLECT-MAGMA (baseline model, gene mapping etc).
    data/magma
  NUMPY_CORES: # Numpy by default uses all cores available to it. This variable limits the amount of cores numpy can use to 1 so snakemake has full control over multicore processing via the '-j' flag.
    1
  EXCLUDE_MHC: # Exclude genes located in the MHC region (chr6:25Mb-34Mb) during prioritization.
    # We recommend setting this to True unless you are deliberately analysing immune GWAS traits.
    # CELLECT-LDSC likewise does not include genetic signal from the MHC region.
    True
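As a rough illustration of how the proposed flag might be consumed, the sketch below reads `EXCLUDE_MHC` from an already-parsed config dictionary. The helper name `should_exclude_mhc` is hypothetical, and falling back to True when the key is missing is an assumption based on the recommendation in the comment above, not existing CELLECT behaviour.

```python
def should_exclude_mhc(config):
    """Return the EXCLUDE_MHC setting from a parsed config dictionary.

    Falls back to True when the key is absent (an assumption mirroring
    the recommended setting, not confirmed CELLECT behaviour).
    """
    magma_const = config.get("MAGMA_CONST") or {}
    return bool(magma_const.get("EXCLUDE_MHC", True))


# A dictionary shaped like the parsed config.yml fragment above.
config = {
    "MAGMA_CONST": {
        "DATA_DIR": "data/magma",
        "NUMPY_CORES": 1,
        "EXCLUDE_MHC": True,
    }
}
exclude_mhc = should_exclude_mhc(config)
```

The prioritization step could then apply the MHC gene filter only when this value is true.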