pveber / morse

Companion R package for MOSAIC website
7 stars 5 forks source link

Possible heisenbug in plot.survFitCstExp #261

Closed pveber closed 6 years ago

pveber commented 6 years ago

The following code:

library(morse)
data(copper)
dat <- survData(copper)
fit <- survFit(dat, model_type = 'SD')
f <- function(c) {
    pdf("rien.pdf")
    plot(fit, concentration=c)
    dev.off()
}

for(c in unique(dat$conc)) {
    f(c)
}

f(0)
f(1.25)
f(2.5)

sometimes crashes with this output:

> library(morse)
> data(copper)
> dat <- survData(copper)
> fit <- survFit(dat, model_type = 'SD')
Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 80
   Unobserved stochastic nodes: 4
   Total graph size: 1568

Initializing model

Warning message:
The estimation of the dominant rate constant (model parameter kd)
    lies outside the range used to define its prior distribution which indicates
    that this rate is very high and difficult to estimate from this experiment ! 
> f <- function(c) {
+     pdf("rien.pdf")
+     plot(fit, concentration=c)
+     dev.off()
+ }
> 
> for(c in unique(dat$conc)) {
+     f(c)
+ }
> 
> f(0)
Erreur : Column `time` must have a unique name
Exécution arrêtée

The bug is not deterministic but typically manifests itself in one execution out of ten. Sometimes R complains about the conc column instead of time. Here the backtrace I get:

17: stop(cnd)
16: abort(paste0(...))
15: stopc(pluralise_msg("Column(s) ", vars), " ", pluralise(problem, 
        vars))
14: invalid_df("must have [a] unique name(s)", x, dups)
13: check_tibble(x)
12: list_to_tibble(x, validate, raw_rownames(x))
11: as_data_frame.data.frame(data)
10: as_data_frame(data)
9: tbl_df(.data)
8: filter(tbl_df(.data), ...)
7: as.data.frame(filter(tbl_df(.data), ...))
6: filter.data.frame(data, conc == concentration)
5: filter(data, conc == concentration)
4: survFitPlotTKTDGenericNoOnePlot(data, dobs, xlab, ylab, spaghetti, 
       adddata, concentration, addlegend)
3: survFitPlotTKTDGeneric(data.credInt, dobs, xlab, ylab, main, 
       concentration, one.plot, spaghetti, adddata, addlegend)
2: plot.survFitCstExp(list(estim.par = list(parameters = c(2L, 1L, 
[...]

So I suspect it might be related to the use of tibble... but the reason why it is not deterministic is absolutely not clear to me!

virgile-baudrot commented 6 years ago

The previous commit seems to solve the issue: the problem was the naming of columns in function 'plot.survFitCstExp' used by 'survFitPlotCITKTD_CstExp'.

Note that the problem stay for 'plot.survfitTKTD' function since 'survFitPlotCITKTD' is exactly how was 'survFitPlotCITKTD_CstExp'.

pveber commented 6 years ago

Would it be difficult to fix plot.survfitTKTD in a similar way? Maybe it's best to leave the issue open then.

virgile-baudrot commented 6 years ago

Add directly upload on the master. note: there should have only one commit, but I renamed it.

pveber commented 6 years ago

super great, thanks!