Al-Murphy / MungeSumstats

Rapid standardisation and quality control of GWAS or QTL summary statistics
https://doi.org/doi:10.18129/B9.bioc.MungeSumstats

Extend quality control and standardisation to QTL analysis #77

Open Al-Murphy opened 2 years ago

Al-Murphy commented 2 years ago

Extend quality control and standardisation to QTL analysis. The following checks are specific to QTL studies and need to be added:

Al-Murphy commented 2 years ago

v1.5.11 can now handle QTL sumstats; however, it only checks the SNPs, not the effect region (the gene, for eQTLs). Remember to set check_dups = FALSE when running MSS for QTLs.
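For anyone wondering why `check_dups = FALSE` matters here, a minimal sketch follows. It assumes the duplicate check works at the SNP level, and that in QTL data the same SNP legitimately appears once per tested gene. The toy table, the `GENE` column name and the file name `eqtl_sumstats.tsv.gz` are hypothetical; only `format_sumstats()` and its `check_dups` argument come from MungeSumstats.

```r
# Toy eQTL table (hypothetical data): each SNP is tested against several genes.
qtl <- data.frame(
  SNP  = c("rs1", "rs1", "rs2", "rs2"),
  GENE = c("GENE_A", "GENE_B", "GENE_A", "GENE_C"),
  BETA = c(0.12, -0.30, 0.05, 0.21),
  stringsAsFactors = FALSE
)

# A SNP-level duplicate check flags valid SNP-gene associations ...
dup_by_snp <- duplicated(qtl$SNP)

# ... whereas no SNP-gene pair is actually repeated.
dup_by_pair <- duplicated(qtl[, c("SNP", "GENE")])

# Hypothetical input file; the call is skipped if the package or file is missing.
qtl_path <- "eqtl_sumstats.tsv.gz"
if (requireNamespace("MungeSumstats", quietly = TRUE) && file.exists(qtl_path)) {
  formatted <- MungeSumstats::format_sumstats(
    path       = qtl_path,
    ref_genome = "GRCh38",
    check_dups = FALSE  # keep repeated SNPs that map to different genes
  )
}
```

Here `dup_by_snp` marks two of the four rows as duplicates, while `dup_by_pair` marks none, which is why SNP-level de-duplication would discard real associations.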

bschilder commented 2 years ago

That's awesome news! Standardizing gene names for eQTLs should be doable using orthogene::map_genes(). This way, a variety of gene inputs (gene symbols, Ensembl IDs, Entrez IDs, transcript IDs, UniProt IDs) can be mapped onto standardized IDs (e.g. gene symbols). If you want to avoid having to install all the deps for orthogene, you could instead use the main function it relies on, gprofiler2::gconvert(): https://github.com/neurogenomics/orthogene/blob/4977da1e09074f5b063f1b0413aa00e08b65929b/R/map_genes.R#L61

For non gene/transcript/protein-based regions, I imagine some other approach would be necessary (e.g. for methylation QTLs).
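A rough sketch of the gene-standardization idea above, not an implementation that exists in MungeSumstats. It assumes the sumstats carry a `GENE` column (a hypothetical name) and uses `gprofiler2::gconvert()`, whose result table has an `input` column (the query ID) and a `name` column (the gene symbol). The helper names `convert_gene_ids()` and `attach_gene_symbols()` are made up for illustration.

```r
# Query g:Profiler for mixed gene identifiers (needs gprofiler2 and internet access).
convert_gene_ids <- function(genes, organism = "hsapiens") {
  gprofiler2::gconvert(
    query      = unique(genes),
    organism   = organism,
    target     = "ENSG",  # target namespace; symbols are returned in `name`
    mthreshold = 1        # keep at most one hit per input ID
  )
}

# Join the mapping back onto the sumstats; unmapped genes become NA.
attach_gene_symbols <- function(sumstats, mapping) {
  if (is.null(mapping)) {
    # Treat a NULL mapping (nothing could be converted) as "no symbols available".
    sumstats$GENE_SYMBOL <- NA_character_
    return(sumstats)
  }
  idx <- match(sumstats$GENE, mapping$input)
  sumstats$GENE_SYMBOL <- mapping$name[idx]
  sumstats
}
```

Usage would look something like `attach_gene_symbols(sumstats, convert_gene_ids(sumstats$GENE))`. Keeping the lookup separate from the join means the join can be checked offline with a hand-made mapping table.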