kgori / sigfit

Flexible Bayesian inference of mutational signatures
GNU General Public License v3.0

Error running second vignette #54

Closed kmegq closed 3 years ago

kmegq commented 3 years ago

Hello, I am getting an error and some warnings when trying to run the second vignette.

My commands were:

library(sigfit)
data("variants_21breast")
counts_21breast <- build_catalogues(variants_21breast)
mcmc_samples_extr <- extract_signatures(counts = counts_21breast, nsignatures = 2:7, iter = 1000, seed = 1756)

After going through the iterations, I got the following:

Error in plot.new() : figure margins too large
In addition: There were 20 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: There were 5 divergent transitions after warmup. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
2: Examine the pairs() plot to diagnose sampling problems

3: The largest R-hat is 1.13, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
6: The largest R-hat is 1.14, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
7: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
8: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
9: The largest R-hat is 1.05, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
10: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
11: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
12: The largest R-hat is 1.18, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
13: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
14: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
15: The largest R-hat is 1.12, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
16: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
17: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
18: The largest R-hat is 1.31, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
19: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
20: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

Am I doing something wrong?

Thank you for your help!

Best, Kate

kgori commented 3 years ago

Hi Kate,

The plotting error occurred because sigfit automatically produces a goodness-of-fit plot after extracting a range of numbers of signatures. It looks like the plot was too big for your plotting environment. Maybe you're on RStudio and the plot window is quite small? If so, making the window bigger should fix this. However, there's no good reason for extract_signatures to fail with an error if the plot goes wrong, especially after spending a long time sampling, so I'll fix this in the next update.
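The idea of keeping the sampling results even when the plot fails can be sketched with a generic pattern. This is not sigfit's actual code: `fit_then_plot` and `failing_plot` are hypothetical names, and a simple sum stands in for the expensive sampling step.

```r
# Hypothetical stand-in for "expensive computation, then a diagnostic plot".
fit_then_plot <- function(plot_fun) {
  result <- sum(1:10)  # placeholder for the long-running sampling step

  # Guard the plot: if it fails, downgrade the error to a warning
  # so the result computed above is still returned.
  tryCatch(
    plot_fun(result),
    error = function(e) {
      warning("Plotting failed, returning results anyway: ", conditionMessage(e))
    }
  )

  result
}

# A plotting function that always fails, mimicking the error in this issue.
failing_plot <- function(x) stop("figure margins too large")

# The warning is suppressed here only to keep the example output quiet.
value <- suppressWarnings(fit_then_plot(failing_plot))
```

With this structure the caller still gets `value` back, and the plotting problem shows up as a warning instead of aborting the whole call.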

The warnings are from the Stan library. The URLs in the warning messages explain them better than I can. In short, they are telling you that the MCMC sampler had some difficulty exploring the parameter space. Generally, seeing a few of these warnings when running sigfit is nothing to worry about. A couple of things can help reduce the number of warnings, at the expense of longer run-times: adding an extra argument extract_signatures(..., control = list(adapt_delta = 0.99)) can help reduce divergent transitions, and increasing the number of iterations can help with the R-hat warnings.
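Applied to the commands from the original post, the two adjustments might look like the sketch below. The value `iter = 2000` is only an illustrative assumption (the original run used 1000), and the `requireNamespace` guard is added just so the snippet can be sourced on a machine without sigfit.

```r
# Sampler settings: adapt_delta as quoted above; the iteration count is an
# illustrative assumption, not a recommended value.
sampler_control <- list(adapt_delta = 0.99)
n_iter <- 2000

if (requireNamespace("sigfit", quietly = TRUE)) {
  library(sigfit)
  data("variants_21breast")
  counts_21breast <- build_catalogues(variants_21breast)

  # Same call as in the original post, with more iterations and a
  # stricter adapt_delta passed through the control argument.
  mcmc_samples_extr <- extract_signatures(
    counts = counts_21breast,
    nsignatures = 2:7,
    iter = n_iter,
    seed = 1756,
    control = sampler_control
  )
}
```

Both changes make the run slower, so it may be worth trying them one at a time.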

Best, Kevin

kmegq commented 3 years ago

Thank you, Kevin!

The issue was that I am doing this work on our servers, which don't have a graphical interface. I was able to fix the problem by setting:

options(device=pdf)

This was suggested in a Stack Overflow post.
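An alternative on a machine without a graphical interface is to open a PDF device explicitly before the plotting call and close it afterwards. The sketch below uses only base R; the output path, the page size, and the placeholder plot are assumptions for illustration, and in practice the extract_signatures call would go where the placeholder plot is.

```r
# Hypothetical output location; any writable path works.
out_file <- file.path(tempdir(), "goodness_of_fit.pdf")

# Open a file-based device with a generous page size (in inches).
pdf(out_file, width = 12, height = 8)

# Placeholder for the call that produces the plot.
plot(1:10, main = "placeholder plot")

# Close the device so the file is flushed to disk.
invisible(dev.off())
```

Unlike changing the default device through `options()`, this keeps the choice of device local to one block of code and makes the output file name explicit.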

Best, Kate

kgori commented 3 years ago

Ah, yes, that would fix it! Glad you got it working. Kevin