precimed / mixer

Causal Mixture Model for GWAS summary statistics
GNU General Public License v3.0
59 stars 17 forks source link

mixer bivariate visualization #40

Open jcfoo opened 3 years ago

jcfoo commented 3 years ago

Hi,

When I try to generate figures using mixer_figures.py two, after a seemingly successful 'fit2' and 'test2' after already calculated univariate results, the message:

--json argument does not contain brute1 optimization results, skip likelihood plot generation

And the last likelihood plot panel is not generated.
I believe I followed the readme steps exactly, and everything else seems to work-- am I missing something simple? I read in another issue that the brute1 optimization is specified by default, but I'm not really sure what is going on. Thank you in advance for your help!

jcfoo commented 3 years ago

Anyone have any idea about this?

ofrei commented 3 years ago

@jcfoo hi, could you please copy the full command you're running which gives this error?

jcfoo commented 3 years ago

Hi! Thanks for the reply!

This is what happens when I run a simple line like this:

python3 mixer_figures.py two --json abcd.test.json --out efgh_result

Am I missing something? Thanks in advance!

ofrei commented 3 years ago

@jcfoo sorry about this, it was an inaccuracy in the readme file, now it is corrected. You need this command with both --json-fit and --json-test:

python3 mixer_figures.py two --json-fit abcd.fit.json --json-test abcd.test.json --out efgh_result

See this diff for more info: https://github.com/precimed/mixer/commit/1aafcd10fb7ab86d0836ce254e2e804b0254ae25 Also, consider checking mixer_simu.md and mixer_real.md for more usage examples . I made them just few weeks ago.

Sorry for very slow response to your question, I see it's been around since February, that's too bad.

jcfoo commented 3 years ago

@ofrei No problem at all, I was wondering if the question just was missed.

Thanks so much for the reply. That's really helpful and I will check all those files out. Cheers!

jcfoo commented 3 years ago

Hi @ofrei, the code you specified enabled the output of the likelihood graph. Thanks!

I'm now faced with another issue. I see on your latest examples there is also an indication of variance on the Venn diagrams. I would like to include this and tried tried using the line:

python3 mixer_figures.py two --json-fit AB.fit.json --json-test AB.test.json --out AB --trait1 A --trait2 B --statistic mean std

However I get an error as follows:

error reading from AB.fit.json, skip error reading from AB.test.json, skip Done. Traceback (most recent call last): File "mixer_figures.py", line 15, in args.func(args) File "/home/dire/mixer/precimed/mixer/figures.py", line 472, in execute_two_parser plt.subplot(1,4,1); make_venn_plot(data_fit, flip=args.flip, traits=[args.trait1, args.trait2], colors=[args.trait1_color, args.trait2_color], statistic=args.statistic) File "/home/dire/mixer/precimed/mixer/figures.py", line 75, in make_venn_plot n1 = data['ci']['nc1@p9'][statistic[0]]/scale_factor; n1_se = data['ci']['nc1@p9'][statistic[1]]/scale_factor if with_std else None KeyError: 'mean'

Am I missing something here or specifying it incorrectly?

Also another question which I could not find an answer to is how do you shrink the legend on the third panel of the figure? Thanks in advance!

ofrei commented 3 years ago

@jcfoo In mixer v1.3 standard errors are computed by bootstrap-like procedure running the analysis 20 times using different subsets of SNPs during the fit procedure, and computing standard errors across 20 point estimates from these runs. In this procedure you also need to run mixer_figures.py combine step prior to mixer_figures.py one/mixer_figures.py two.

To run 20 times I typically use SLURM's --array feature as e.g. here https://github.com/precimed/mixer/blob/master/scripts/xsede_snps_script.sh#L12 also another example here https://github.com/comorment/containers/blob/main/usecases/mixer_real/MIXER_REAL.job .

Does this help or are you already following this procedure?

jcfoo commented 3 years ago

@ofrei

Thanks. I did not update to mixer 1.3 yet, as I just found out about it. I had completed all analyses already and did not have the time yet to update and redo everything, I just needed to figure out the visualizations (the SE, legend, and missing panel).

So, I did not follow those procedures you mentioned yet. Would you recommend redoing everything after updating to mixer 1.3?

ofrei commented 3 years ago

Here is a description of what's changing between v1.2 and v1.3 https://github.com/precimed/mixer#upgrade-nodes-from-mixer-v12-to-v13 The code of the MiXeR is literally the same, it's only the procedure that has changed - instead of constraining to HapMap3 SNPs, the new recommendation to use 20 pre-defined set of SNPs as described in the above link.

Generally v1.3 is more robust - by running 20 times you'll see if point estimates change too much across runs. This woudl indicate that GWAS summary statistics have insufficient sample size to power MiXeR analysis. This is also reflected in standard errors. But for well powered analysis changing from v1.2 to v1.3 shouldn't make any noticeable difference.

If you have CPU hours to re-run I would suggest to do so just to be safe.

jcfoo commented 3 years ago

Thanks so much. I'll see if it is feasible to redo it all to be on the safe side. The extra robustness will definitely be better for the analysis, even if in the end the results do not change much. With respect to the visualization of the size of the legend, is that automatically adjusted as well in v 1.3?

jcfoo commented 3 years ago

I'm trying now to start everything from the beginning in v 1.3.

When I get to the step to generate: 1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.repNN.snps

I tried to use the command provided in the readme:

--maf 0.05 --subset 2000000 --r2 0.8 --seed ${SLURM_ARRAY_TASK_ID} \

but get the message:

mixer.py snps: error: argument --seed: expected one argument

Am I doing something wrong? As always, thanks so much!

ofrei commented 3 years ago

@jcfoo I'm not sure, --seed: expected one argument suggests that ${SLURM_ARRAY_TASK_ID} is empty i.e. there is no --array of the slurm command. Either way you could download all .snps files from here: https://github.com/comorment/containers/tree/main/reference/ldsc/1000G_EUR_Phase3_plink (scroll towards the end of the page)

jcfoo commented 2 years ago

Hi @ofrei , thanks for all your previous help. I managed to get everything up and running finally with SLURM, it turned out our IT department had blocked some things. I'm in the process of rerunning my analyses and at the visualization step, one of my bivariate visualization gives me this:

RuntimeWarning: All-NaN slice encountered plot_y[i, :] = plot_y[i, :] - np.nanmin(plot_y[i, :])

but still goes on to generate the .png file. Do you know what might be happening with this? Thanks in advance for your help!