Closed zhangcc89claire closed 3 years ago
Hi, the colocalization analysis needs the sample size for the gene expression / weights data. You can see the sample size for the public weights here (http://gusevlab.org/projects/fusion/#reference-functional-data). You can include this sample size in two ways:
- Adding an "N" column to the --weights file that lists the sample size for each weight.
- Adding a "PANEL" column to the --weights file that lists the name of the expression dataset, and then providing a --PANELN [file] flag to the analysis where [file] looks like this:
PANEL N
name1 size1
name2 size2
name3 size3
See also the documentation here: https://gusevlab.org/projects/fusion/#colocalization-analysis-with-coloc
Does the --PANELN [file] look like this?
PANEL N
gene1.wgt.data 452
gene2.wgt.data 452
gene3.wgt.data 452
Thanks!
Best, CC
Sorry for the confusion, the PANELN files should look like:
PANEL N
CMC 452
and then the --weights file should have a column PANEL with the value CMC for all individuals.
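A minimal sketch of the PANEL / --PANELN setup described above, using only standard shell tools. The temporary directory and the file names example.pos, example.panel.pos, and paneln.txt are hypothetical, and the sketch assumes the weights file is whitespace-separated with a single header line:

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
pos="$workdir/example.pos"

# A tiny stand-in for a weights (.pos) file without sample-size information.
printf 'WGT ID CHR P0 P1\n' > "$pos"
printf 'CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826\n' >> "$pos"
printf 'CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499\n' >> "$pos"

# Append a PANEL column: "PANEL" on the header line, "CMC" on every data row.
awk 'NR == 1 { print $0, "PANEL"; next } { print $0, "CMC" }' "$pos" \
  > "$workdir/example.panel.pos"

# Write the two-line file that would be passed to --PANELN.
printf 'PANEL N\nCMC 452\n' > "$workdir/paneln.txt"

cat "$workdir/example.panel.pos"
cat "$workdir/paneln.txt"
```

With files like these, --weights would point at the edited weights file and --PANELN at the two-line panel file.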
--weights file should have a column PANEL with the value CMC for all individuals
WGT ID CHR P0 P1 N
CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826 452
CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499 452
Should I add the last column in the POS file like this?
Thank you very much for your patience
You can do it this way too, in which case you won't need to provide --PANELN:
WGT ID CHR P0 P1 N
CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826 452
CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499 452
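For the N-column route, a one-line awk pass is one way to append the column. This is only a sketch: the file names original.pos and with_n.pos are hypothetical, the input is assumed to be whitespace-separated with one header line, and 452 is the CMC sample size quoted in this thread:

```shell
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
printf 'WGT ID CHR P0 P1\n' > "$tmp/original.pos"
printf 'CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499\n' >> "$tmp/original.pos"

# Append "N" to the header line and the sample size to every data row.
awk -v n=452 'NR == 1 { print $0, "N"; next } { print $0, n }' \
  "$tmp/original.pos" > "$tmp/with_n.pos"

cat "$tmp/with_n.pos"
```

Writing to a new file rather than editing in place keeps the original weights file available for comparison.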
I just added an N column (452) to the POS file, but I still get an error?
WGT ID CHR P0 P1 N
CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826 452
CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499 452
fusion_twas-master claire$ Rscript FUSION.assoc_test.R \
--sumstats PGC2.SCZ.sumstats \
--weights ./WEIGHTS/GTEx.Whole_Blood.pos \
--weights_dir ./WEIGHTS/ \
--ref_ld_chr ./LDREF/1000G.EUR. \
--chr 22 \
--coloc_P 0.05 \
--GWASN 1234567 \
--out PGC2.SCZ.22.dat
ERROR : 'N' field needed in weights file or 'PANEL' column and --PANELN flag required for COLOC analysis
This error should only be triggered when you don't have N in the weights file (see the code below). Are you sure you're using the right weights file?
https://github.com/gusevlab/fusion_twas/blob/master/FUSION.assoc_test.R#L97
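One quick way to confirm which file actually carries the N column is to inspect the header of the exact file passed to --weights. The helper name has_column and the stand-in files below are hypothetical; this is a sketch that assumes a whitespace-separated header line, and it is not part of FUSION:

```shell
# Succeeds if the header (first line) of file $2 contains a column named $1.
has_column() {
  head -n 1 "$2" | tr -s ' \t' '\n' | grep -qxF "$1"
}

# Hypothetical stand-in files: one edited to include N, one left unedited.
tmpdir=$(mktemp -d)
printf 'WGT ID CHR P0 P1 N\n' > "$tmpdir/edited.pos"
printf 'WGT ID CHR P0 P1\n' > "$tmpdir/unedited.pos"

for f in "$tmpdir/edited.pos" "$tmpdir/unedited.pos"; do
  if has_column N "$f"; then
    echo "$(basename "$f"): has an N column"
  else
    echo "$(basename "$f"): no N column"
  fi
done
```

Running the same check on the path given to --weights shows whether the edited file is the one being used.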
Thanks very much for your reply! Is --GWASN just the number of patients, or the total number of patients and controls?
Best, Cheng
The total size of the study (cases and controls)
https://gusevlab.org/projects/fusion/#colocalization-analysis-with-coloc
I can't open the link, can you send it to me one more time? Thank you!
Have you successfully done it? I need your help!
Here is the link; for some reason the original contained an "https": http://gusevlab.org/projects/fusion/#colocalization-analysis-with-coloc
Rscript FUSION.assoc_test.R \
--sumstats PGC.UKB.MDD.for.fusion.sumstats \
--weights ./WEIGHTS/CMC.BRAIN.RNASEQ.pos \
--weights_dir ./WEIGHTS/ \
--ref_ld_chr ./LDREF/1000G.EUR. \
--chr 7 \
--coloc_P 0.05 \
--GWASN 1234567 \
--PANELN CMC.BRAIN.RNASEQ.profile $2 \
--out PGC.BPM2018.CMC.DLPFC.chr7.dat
I'm not sure what --PANELN needs me to type in here. I have looked through the manual carefully and still can't solve it. Can anyone help me? Thanks!
Best, CC