bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0

rg all vs all #298

Open zillurbmb51 opened 3 years ago

zillurbmb51 commented 3 years ago

Hi, I want to calculate the genetic correlations among 11 phenotypes. From the wiki, I can see that ldsc.py only gives the correlation between the first trait and each of the other traits.

(ldsc.py computes the genetic correlation between the first file in the list and each subsequent file, i.e., --rg a,b,c will compute rg(a,b) and rg(a,c).)

How could I get rg(b,c)? More generally, how could I get the correlation of every trait with every other trait, so that I could create a heatmap like the one presented on pages 24 and 25 of https://www.med.unc.edu/pgc/wp-content/uploads/sites/959/2019/01/pgc_stat_bulik_2015.pdf? For a few traits, we could easily change the first trait by hand, but for many traits this becomes tedious. Any help?

shafiqnoa commented 2 years ago

Hi @zillurbmb51, have you solved this? I am experiencing the same challenge. @rkwalters and team, can you help?

michaelofrancis commented 2 years ago

I'm not an admin here but I'm pretty sure this is a coding question, and has nothing to do with LDSC. You should know basic coding to be doing bioinformatics...

https://www.geeksforgeeks.org/looping-statements-shell-script/

shafiqnoa commented 2 years ago

@michaelofrancis, thanks for sharing the link. I understand there is an extension called cross-trait ldsr for doing that. Unfortunately, I could not figure out how to use it. If you have loop code that achieves the same result, I would appreciate it if you shared it.

michaelofrancis commented 2 years ago

This may come off as harsh, but being able to figure out how to run a shell loop on your own is a prerequisite to doing genomics research, which is orders of magnitude more complicated.

zillurbmb51 commented 2 years ago

@shafiqnoa The simplest solution I came up with is the bash code below. It works even when trait1 and trait2 contain different numbers of traits, and it assumes that all the files have the suffix .sumstats.gz. I hope it helps; you can modify the script to fit your data and directory structure.


# Two trait lists; every trait1 x trait2 combination is run once.
declare -a trait1=("A" "B" "C")
declare -a trait2=("P" "Q" "R" "S" "T")
for tr2 in "${trait2[@]}"; do
  for tr1 in "${trait1[@]}"; do
    python /path/to/ldsc.py \
      --rg "${tr1}.sumstats.gz,${tr2}.sumstats.gz" \
      --ref-ld-chr /path/to/eur_w_ld_chr/ \
      --w-ld-chr /path/to/eur_w_ld_chr/ \
      --out "${tr1}-${tr2}"
  done
done
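For the all-versus-all case asked about at the top of the thread, where there is a single list of traits, the same idea can be sketched with an inner loop that starts one position after the outer loop, so each unordered pair is run once. This is only a sketch under assumptions: the trait names, the all_pairs helper, and the DRY_RUN switch are made up for illustration, trait names are assumed to contain no spaces, and the paths are the same placeholders as in the script above. Since rg(a,b) and rg(b,a) describe the same quantity, n traits need n(n-1)/2 runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every unordered pair of its arguments once,
# one pair per line, separated by a space.
all_pairs() {
  local -a items=("$@")
  local n=${#items[@]} i j
  for ((i = 0; i < n - 1; i++)); do
    for ((j = i + 1; j < n; j++)); do
      printf '%s %s\n' "${items[i]}" "${items[j]}"
    done
  done
}

# Hypothetical trait names; each is assumed to have a <trait>.sumstats.gz file.
traits=("A" "B" "C" "D")

# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to actually run ldsc.py.
DRY_RUN="${DRY_RUN:-1}"

all_pairs "${traits[@]}" | while read -r tr1 tr2; do
  cmd=(python /path/to/ldsc.py
       --rg "${tr1}.sumstats.gz,${tr2}.sumstats.gz"
       --ref-ld-chr /path/to/eur_w_ld_chr/
       --w-ld-chr /path/to/eur_w_ld_chr/
       --out "${tr1}-${tr2}")
  if [ "$DRY_RUN" = "1" ]; then
    echo "${cmd[@]}"
  else
    "${cmd[@]}"
  fi
done
```

With the four example traits this prints six commands, one per pair. Because --rg a,b,c computes rg(a,b) and rg(a,c) in one call, as quoted from the wiki above, another option would be to call ldsc.py once per trait with that trait followed by all later traits in the list, which needs fewer runs but produces one log file per leading trait rather than one per pair.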

shafiqnoa commented 2 years ago

@zillurbmb51 thank you very much for sharing the script.