Closed billdenney closed 6 years ago
Never seen this before. But it's is in some C++ code somewhere and the model functions never get a matrix; so maybe the $OMEGA
or $SIGMA
matrices?
To start narrowing it down:
If you can:
revar(model)
vpcresult <-
model %>%
mrgsim()
vpcresult <-
model %>%
drop_re() %>%
mrgsim()
vpcresult <-
model %>%
drop_re() %>%
data_set(vpc_dataset) %>%
mrgsim()
It seems happy enough with drop_re
:
> revar(model)
$omega
$...
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
ETA1: 0.0732038 0.0261146 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA2: 0.0261146 0.0302285 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA3: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA4: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA5: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA6: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA7: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA8: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA9: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA10: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA11: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA12: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA13: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA14: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.411096 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA15: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.107419 0 0 0 0 0 0 0.000000 0 0 0 0
ETA16: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA17: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA18: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA19: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA20: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA21: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA22: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.132547 0 0 0 0
ETA23: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA24: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA25: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
ETA26: 0.0000000 0.0000000 0 0 0 0 0 0 0 0 0 0 0 0.000000 0.000000 0 0 0 0 0 0 0.000000 0 0 0 0
$sigma
$...
[,1] [,2] [,3]
1: 0 0 0
2: 0 0 0
3: 0 0 0
> vpcresult <-
+ model %>%
+ mrgsim()
> vpcresult <-
+ model %>%
+ drop_re() %>%
+ mrgsim()
> vpcresult <-
+ model %>%
+ drop_re() %>%
+ data_set(vpc_dataset) %>%
+ mrgsim()
Dropping non-numeric columns: C EXCLUDE_TEXT Treatment_Text USUBJID CMT_name
>
Also, a simpler simulation where I'm generating the dataset by hand works without issue (without drop_re
).
Is there a chance that I'm passing in a variable that I'm also defining in the model and that's causing the problem? (i.e. a column name in vpc_dataset
is the same as a parameter that may be calculated in the model. I'm checking that now.)
Might it have something to do with a potentially repeated ID number in the dataset?
When I change n_sub
from 1 to 2 in the code below, it gets the error. Is there anything obvious there?
## Visual Predictive Checks
n_sub <- 1
vpc_dataset <-
d %>%
filter(is.na(EXCLUDE_TEXT)) %>%
full_join(expand.grid(ID=unique(d$ID),
IDNEW=1:n_sub) %>%
mutate(IDNEW=as.numeric(factor(paste(ID, IDNEW))))) %>%
rename(ID_ORIG=ID,
ID=IDNEW) %>%
arrange(ID_ORIG, ID, TIME, desc(EVID), CMT)
vpcresult <-
model %>%
data_set(vpc_dataset) %>%
mrgsim()
P.S. I'm going for the record of most comments in the shortest time. :)
Now it's getting weird: When I remove sorting by the ID_ORIG
column, I can go up to n_sub <- 6
(n_sub=6 has no error; n_sub=7 has an error).
OK, I think I found the underlying error: some of the TIME column values were NA
. Filtering those out fixed the error.
And on top of that, I was capturing TIME
to use time after dose in one of the equations which made the differential equation results all NaN
. I'll try to make a minimally reproducible example shortly (unless this makes the fix obvious to you and you don't need more).
Ok; hrm ... I was thinking it was some issue with $OMEGA
, but now maybe it's $SIGMA
; not sure. I'll try to reproduce as well. There isn't any checking for NA
values going on; usually when it's a covariate, just fills the simulation with NA
. Wasn't aware that missing TIME
could fail so cryptically.
With this, it's raising NA or NaN to a power at one spot, and at a different point in the model, it is taking the square root of NA or NaN. These are then being multiplied by EPS(1) (or EPS(2) or EPS(3)).
As I look more, there could be a lot of math going on with NA inputs.
Another thought that may have to do with the sorting. If all data were NA and had math occur (e.g. pow()
, sqrt()
, exp()
, or one of the basic math operators +
, -
, *
, or /
) instead of just some of the data being NA could that narrow down the problem?
I couldn't get the matrix access problem; but did see this (sort of what I was expecting)
library(mrgsolve)
mod <- mread("popex", modlib())
#> Compiling popex ... done.
mod <- mrgsolve:::house()
data(exTheoph)
df <- exTheoph
df$time[c(4, 12, 33)] <- NA
mod <- omat(mod, dmat(rep(1, 4)))
mod <- smat(mod, dmat(1))
mrgsim(mod, data = df)
#> Model: housemodel
#> Dim: 132 x 7
#> Time: NA to NA
#> ID: 12
#> ID time GUT CENT RESP DV CP
#> [1,] 1 0.00 4.020000 0.0000 13.33 0.000000 0.000000
#> [2,] 1 0.25 3.191658 0.8262 13.33 0.009952 0.006478
#> [3,] 1 0.57 2.375462 1.6345 13.32 0.031590 0.012815
#> [4,] 1 NA NA NA NA NA NA
#> [5,] 1 2.02 0.623063 3.3100 13.30 0.063974 0.025953
#> [6,] 1 3.82 0.118310 3.6872 13.30 0.031993 0.028910
#> [7,] 1 5.10 0.036304 3.6754 13.30 0.064719 0.028817
#> [8,] 1 7.03 0.006114 3.5666 13.30 0.038954 0.027964
But I also saw something different with a different model
library(mrgsolve)
mod <- mread("popex", modlib())
#> Compiling popex ... done.
data(exTheoph)
df <- exTheoph
df$time[c(4, 12, 33)] <- NA
mod <- omat(mod, dmat(rep(1, 3)))
mod <- smat(mod, dmat(1))
mrgsim(mod, data = df)
#> Model: popex
#> Dim: 132 x 9
#> Time: NA to NA
#> ID: 12
#> ID time GUT CENT CL V IPRED DV ECL
#> [1,] 1 0.00 4.020 0.000 1.932 7.799 0.0000 0.0000 0.5624
#> [2,] 1 0.25 2.938 1.048 1.932 7.799 0.1343 0.1174 0.5624
#> [3,] 1 0.57 1.966 1.899 1.932 7.799 0.2435 0.0792 0.5624
#> [4,] 1 NA NA NA 1.932 7.799 NA NA 0.5624
#> [5,] 1 2.02 NA NA 1.932 7.799 NA NA 0.5624
#> [6,] 1 3.82 NA NA 1.932 7.799 NA NA 0.5624
#> [7,] 1 5.10 NA NA 1.932 7.799 NA NA 0.5624
#> [8,] 1 7.03 NA NA 1.932 7.799 NA NA 0.5624
I haven't been able to make a minimal reproducible example, but I have made something else that gives an unusual error:
code_test <- '
$PARAM THETA1=1.0, THETA2=1.0, THETA3=1.0, COVARIATE1=-1.0, COVARIATE2=-1.0
$CMT CENTRAL
$OMEGA @annotated
ETA1: 0.01: foo
$SIGMA 0
$MAIN
// Record the dosing time
if (NEWIND < 2 || EVID == 1.0 || EVID == 4.0)
double TDOSE = TIME;
double K10=THETA1*pow(-0.1, COVARIATE1);
double V1=THETA2*pow(-0.1, COVARIATE2)*exp(ETA1);
double PROP_ERR=THETA3;
$ODE
double K10TIME = K10*exp(-2*(SOLVERTIME-TDOSE));
dxdt_CENTRAL = -K10TIME*CENTRAL;
$TABLE
double CP = CENTRAL/V1;
double CPeps = CP + EPS(1)*sqrt(pow(PROP_ERR*CP, 2));
$CAPTURE CP CPeps
'
model_raw <- mcode("code_test", code_test)
model_param <- list(THETA1=1, THETA2=1, THETA3=1)
model <-
model_raw %>%
param(model_param) %>%
omat(as_bmat(0.05))
simdata <-
data.frame(ID=1,
cmt=1,
time=c(0, 0:5),
COVARIATE1=2,
COVARIATE2=3,
evid=c(1, rep(0, 6)),
amt=1)
simresult <-
model %>%
data_set(bind_rows(data.frame(ID=1, evid=1, time=NA), simdata)) %>%
mrgsim()
Error in tran_mrgsim(<S4 object of class "mrgmod">, data = c(1, 1, 1, :
vector::_M_range_check: __n (which is 4294967295) >= this->size() (which is 1)
@kylebmetrum, To help with debugging the original issue, is there a specific part of the code that I can examine to find the problem? Is there a way to narrow it down to a line number, an operation, or a variable?
@billdenney Sorry for the delay ... our internet service was out overnight and just came back.
I know where the issue is - trying to access a vector at index NA
. There needs to be a check for missing values and an error generated at least for some of these data items. I'm afraid that if(any(is.na(simdata))) stop()
is going to be too heavy handed at first since people probably pass missing data through in the data set and it has no consequence. But maybe that should be the goal eventually. I'd use it as a check in the meantime.
I'll put in some code to start requiring it for cmt
, time
, etc ...
@kylebmetrum, no worries at all on the delay. Once I got to the point that removing data with NA
time fixed my issue, it was more about helping make mrgsolve
more robust.
While I think of mrgsolve
as mostly an exceptionally useful ODE-integrating tool, I also see its value in time-independent simulations (e.g, if I'm fitting an Emax model). With that, I agree that if(any(is.na(simdata))) stop()
is too heavy-handed. And, there may be times that I want to pass in NA[1].
Warnings that NA are there in required columns seems reasonable, and giving errors if NA are in required pairs (e.g. evid %in% c(1, 4, 8)
and is.na(amt)
seems like it would be an error).
[1] Though I'd have to look up how to test for NA and run a model with NA in mrgsolve
. I've not looked, but if it doesn't exist in the docs, NA and NaN handling would be a useful topic. For me, how to test for equality to NA
within a code block (probably $MAIN
but generally in any code) would be useful.
@billdenney You can test for NA with isnan
> library(mrgsolve)
>
> code <- '
+ $PARAM FOO = 1
+
+ $MAIN
+ if(isnan(FOO)) {
+ report("Foo is not a number");
+ report("time ", TIME);
+ }
+
+ $CAPTURE FOO
+ '
>
> mod <- mcode("nan", code)
Compiling nan ... done.
>
> data <- data.frame(amt = 0, time = c(1,2,3,4), cmt = 0,
+ FOO = c(1,2,NA,4), evid = 0, ID = 1)
>
> mrgsim(mod, data = data, end = -1)
from report Foo is not a number
time 3
Model: nan
Dim: 4 x 3
Time: 1 to 4
ID: 1
ID time FOO
[1,] 1 1 1
[2,] 1 2 2
[3,] 1 3 NA
[4,] 1 4 4
It's just tricky; the parameter update will happen under the hood no matter what; so if you have some other path to take, you can do it. But there isn't an "automatic" way to get back to the last value before you see the nan etc. I dunno ... maybe this is enough to get done what you want to do.
Thanks!
I'm not sure where to start looking for the fix to the error mentioned in the subject. The full error is below (and I unfortunately can't share the code).