statgen / pheweb

A tool to build a website to browse hundreds or thousands of GWAS.
MIT License

pheweb-rg-pipeline does NOT work #220

Closed (jielab closed this issue 2 months ago)

jielab commented 4 months ago

Hi there,

I previously reported that I could not get pheweb-rg-pipeline to work. The core problem is that LDSC, installed with conda install LDSC, runs under Python 2. I can install Nextflow with conda install nextflow, but it will not run once I execute conda activate LDSC. Could you please provide a correct, step-by-step tutorial on how to install pheweb-rg-pipeline?

The GitHub page for pheweb-rg-pipeline says "inside the LDSC/nextflow.config file", but where is that file? As I said, I installed LDSC with conda install LDSC. Should I also git clone the LDSC repository?

In the end, I ran LDSC locally and separately, and generated an output file like the one below: [image]

However, after I put this file into the same folder as pheweb, I got the following error: [image]

Can someone please help? I would hate to be unable to get the amazing pheweb-rg-pipeline working.

Thank you & best regards, Jie

sgagliano commented 4 months ago

Hi Jie,

Information on how to run the pheweb-rg pipeline is outlined in the README of the following GitHub repository: https://github.com/statgen/pheweb-rg-pipeline/tree/master. The file you are referring to (nextflow.config) is found in this repository, specifically here: https://github.com/statgen/pheweb-rg-pipeline/blob/master/LDSC/nextflow.config.
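For readers following along, here is a minimal shell sketch for confirming that the file is present in a local checkout of the repository. It assumes git is installed and that the clone URL follows the usual GitHub convention for the repository linked above. `find_ldsc_config` is a hypothetical helper name used only for illustration and is not part of the pipeline.

```shell
# Hypothetical helper (not part of pheweb-rg-pipeline): print the path to
# LDSC/nextflow.config inside a local checkout, or fail if it is missing.
find_ldsc_config() {
    config="$1/LDSC/nextflow.config"
    if [ -f "$config" ]; then
        printf '%s\n' "$config"
    else
        printf 'not found: %s\n' "$config" >&2
        return 1
    fi
}

# Usage, assuming git is installed and network access is available
# (clone URL assumed from the repository link above):
#   git clone https://github.com/statgen/pheweb-rg-pipeline.git
#   find_ldsc_config pheweb-rg-pipeline
```

The file lives in the pipeline's repository, not in the LDSC installation, so cloning pheweb-rg-pipeline is what makes it available locally.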

As a side note, based on the naming of your correlations file, it seems that you are testing the same trait (height) across different genetic ancestries. As noted in the pheweb-rg pipeline README, it is currently set up to run on summary statistics derived from individuals of European genetic ancestry (i.e. the pipeline uses LDSC-supplied LD scores based on LD patterns in the 1000 Genomes European super-population). For a diverse set of samples, I would suggest exploring other summary-statistics-based tools for estimating genetic correlation.

Hope this helps.