Problem with FUSION "coloc"

zhangcc89claire commented 3 years ago

Rscript FUSION.assoc_test.R \
--sumstats PGC.UKB.MDD.for.fusion.sumstats \
--weights ./WEIGHTS/CMC.BRAIN.RNASEQ.pos \
--weights_dir ./WEIGHTS/ \
--ref_ld_chr ./LDREF/1000G.EUR. \
--chr 7 \
--coloc_P 0.05 \
--GWASN 1234567 \
--PANELN CMC.BRAIN.RNASEQ.profile $2 \
--out PGC.BPM2018.CMC.DLPFC.chr7.dat

ERROR : 'N' field needed in weights file or 'PANEL' column and --PANELN flag required for COLOC analysis

I'm not sure what --PANELN expects me to provide here. I have read through the manual carefully and still can't work it out. Can anyone help me? Thanks!

Best, CC

sashagusev commented 3 years ago

Hi, the colocalization analysis needs the sample size for the gene expression / weights data. You can see the sample size for the public weights here (http://gusevlab.org/projects/fusion/#reference-functional-data). You can include this sample size in two ways:

Adding an "N" column to the --weights file that lists the sample size for each weight.
Adding a "PANEL" column to the --weights file that lists the name of the expression dataset, and then providing a --PANELN [file] flag to the analysis where [file] looks like this:
```
PANEL N
name1 size1
name2 size2
name3 size3
```

See also the documentation here: https://gusevlab.org/projects/fusion/#colocalization-analysis-with-coloc

zhangcc89claire commented 3 years ago

Does the --PANELN [file] look like this?

PANEL N
gene1.wgt.data 452
gene2.wgt.data 452
gene3.wgt.data 452

Thanks!

Best, CC

sashagusev commented 3 years ago

Sorry for the confusion, the PANELN file should look like:

PANEL N
CMC 452

and then the --weights file should have a column PANEL with the value CMC in every row (one row per gene).
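As a rough illustration of this setup, the sketch below builds both pieces with `awk` and `printf`. It assumes the .pos file is whitespace-delimited with a header row; the `toy_*` file names are made up for the example, and the two gene rows are copied from this thread.

```shell
# Toy stand-in for a .pos file without PANEL or N columns (hypothetical file name).
printf 'WGT ID CHR P0 P1\n' > toy_panel_in.pos
printf 'CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826\n' >> toy_panel_in.pos
printf 'CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499\n' >> toy_panel_in.pos

# Append a PANEL column: the header gets "PANEL", every gene row gets "CMC".
awk -v panel="CMC" 'NR == 1 { print $0, "PANEL"; next } { print $0, panel }' \
    toy_panel_in.pos > toy_panel_out.pos

# Write the panel-to-sample-size lookup file that would be passed via --PANELN.
printf 'PANEL N\nCMC 452\n' > toy_paneln.txt

cat toy_panel_out.pos
cat toy_paneln.txt
```

The analysis would then be pointed at the new files with `--weights toy_panel_out.pos --PANELN toy_paneln.txt`.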

zhangcc89claire commented 3 years ago

--weights file should have a column PANEL with the value CMC for all individuals

WGT ID CHR P0 P1 N
CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826 452
CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499 452

Should I add the last column to the POS file like this?

zhangcc89claire commented 3 years ago

Thank you very much for your patience

sashagusev commented 3 years ago

You can do it this way too, in which case you won't need to provide --PANELN:

WGT ID CHR P0 P1 N
CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826 452
CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499 452
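If the .pos file does not have the N column yet, one possible way to add it is sketched below. It assumes a whitespace-delimited file with a header row and a single sample size (452 here) that applies to every weight; the `toy_*` file names are hypothetical.

```shell
# Toy stand-in for a .pos file that lacks the N column (hypothetical file name).
printf 'WGT ID CHR P0 P1\n' > toy_n_in.pos
printf 'CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826\n' >> toy_n_in.pos
printf 'CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499\n' >> toy_n_in.pos

# Append "N" to the header line and the sample size to every following row.
awk -v n=452 'NR == 1 { print $0, "N"; next } { print $0, n }' \
    toy_n_in.pos > toy_n_out.pos

cat toy_n_out.pos
```

Writing to a new file rather than editing in place keeps the original .pos file intact; the new file is what `--weights` should then point to.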

zhangcc89claire commented 3 years ago

If I just add a column N (452) to the POS file, I still get an error?

WGT ID CHR P0 P1 N
CMC.BRAIN.RNASEQ/CMC.LOC643837.wgt.RDat LOC643837 1 762970 794826 452
CMC.BRAIN.RNASEQ/CMC.AGRN.wgt.RDat AGRN 1 955502 991499 452

fusion_twas-master claire$ Rscript FUSION.assoc_test.R \
--sumstats PGC2.SCZ.sumstats \
--weights ./WEIGHTS/GTEx.Whole_Blood.pos \
--weights_dir ./WEIGHTS/ \
--ref_ld_chr ./LDREF/1000G.EUR. \
--chr 22 \
--coloc_P 0.05 \
--GWASN 1234567 \
--out PGC2.SCZ.22.dat

ERROR : 'N' field needed in weights file or 'PANEL' column and --PANELN flag required for COLOC analysis

sashagusev commented 3 years ago

This error should only be triggered when you don't have N in the weights file (see code below). Are you sure you're using the right weights file?

https://github.com/gusevlab/fusion_twas/blob/master/FUSION.assoc_test.R#L97
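A quick way to check is to look at the header of the exact file passed to --weights (./WEIGHTS/GTEx.Whole_Blood.pos in the command above). The sketch below is only an illustration: `has_column` is a hypothetical helper, not part of FUSION, and the `toy_*` files stand in for real .pos files.

```shell
# Hypothetical helper: succeeds if the header (first line) of file $1
# contains a field that is exactly the column name $2.
has_column() {
    awk -v col="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) found = 1; exit }
        END { exit !found }
    ' "$1"
}

# Toy stand-ins for a .pos file without and with the N column.
printf 'WGT ID CHR P0 P1\nx.wgt.RDat GENE1 22 100 200\n' > toy_no_n.pos
printf 'WGT ID CHR P0 P1 N\nx.wgt.RDat GENE1 22 100 200 452\n' > toy_has_n.pos

for f in toy_no_n.pos toy_has_n.pos; do
    if has_column "$f" N; then
        echo "$f: N column present"
    else
        echo "$f: N column missing"
    fi
done
```

The same check with `PANEL` in place of `N` covers the --PANELN route.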

zhangcc89claire commented 3 years ago

Thanks very much for your reply! Is --GWASN just the number of patients, or the total number of patients and controls?

Best, Cheng

sashagusev commented 3 years ago

The total size of the study (cases plus controls).
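As a small worked example, with made-up counts that do not come from any real study:

```shell
# Hypothetical case and control counts, for illustration only.
n_cases=1000
n_controls=2000

# --GWASN takes the total study size, i.e. cases plus controls.
gwas_n=$((n_cases + n_controls))
echo "--GWASN $gwas_n"
```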