Closed wonmy076 closed 1 year ago
Hello @wonmy076, Yes it is weird. I also see that ERRPROP does not change either, and IPRE is null. Can I ask you to post a bit of code so that I can reproduce this please, and see what is wrong? Ideally in a form of a reprex like this:
library(mrgsolve)
#>
#> Attachement du package : 'mrgsolve'
#> L'objet suivant est masqué depuis 'package:stats':
#>
#> filter
cod <- "
$OMEGA 0.1 0.1
$SIGMA 0.02 0.1
$CMT CENT
$MAIN
double ETACL = ETA(1) ;
double ETAV = ETA(2) ;
double CL = 1 * exp(ETACL) ;
double V = 30 * exp(ETAV) ;
$PKMODEL ncmt = 1, depot = FALSE
$TABLE
double ERRPROP = EPS(1) ;
double ERRADD = EPS(2) ;
double IPRED = CENT / V ;
double DV = IPRED * (1 + ERRPROP) + ERRADD ;
$CAPTURE
ETACL ETAV ERRPROP ERRADD IPRED DV
"
mod <- mcode("mod", cod)
#> Building mod ...
#> done.
set.seed(1234)
mrgsim_df(mod, tgrid = 0)
#> ID time CENT ETACL ETAV ERRPROP ERRADD IPRED DV
#> 1 1 0 0 -0.2764698 0.08752883 0.1975575 0.4987714 0 0.4987714
set.seed(1234)
mrgsim_df(mod, tgrid = 0)
#> ID time CENT ETACL ETAV ERRPROP ERRADD IPRED DV
#> 1 1 0 0 -0.2764698 0.08752883 0.1975575 0.4987714 0 0.4987714
Created on 2022-08-31 by the reprex package (v2.0.1)
Félicien
library(mrgsolve)
#>
#> Attaching package: 'mrgsolve'
#> The following object is masked from 'package:stats':
#>
#> filter
code <- '
$INIT
CMT1 = 0,
CMT2 = 0,
AUC = 0,
$SET
atol = 1e-8, rtol = 1e-8
maxsteps = 100000
$PARAM
POPCL = exp(1.67),
POPV1 = exp(3.75),
POPV2 = exp(3.73),
POPQ = exp(1.17),
MAT50_CL = exp(-0.113),
GAMMA1 = exp(1.06),
DEC50_CL = exp(4.12),
GAMMA2 = -exp(0.806),
FSCR = exp(-0.436),
CANCER_CL = 0.293,
HEEL_V1 = 0.313,
HEEL_Q = 0.595,
// Default Covariate values for Simulation
AGE = 35,
GA = 40,
WT = 70,
SCR = 0.84,
HEEL = 0,
CANCER = 0,
INFDUR = 1,
$OMEGA
@labels PPVCL, PPVV1, PPVV2
0.0749 // PPV on CL
0.0721 // PPV on V1
0.672 // PPV on V2
$SIGMA
@labels ERRPROP, ERRADD
0.046
1.52
$MAIN
double PMA = AGE + GA/52;
double PMA_REF = 35 + (40.0/52.0);
double WT_NORM = WT/70;
double FMAT = pow(PMA,GAMMA1)/(pow(PMA,GAMMA1)+pow(MAT50_CL,GAMMA1));
double FMAT_REF = pow(PMA_REF,GAMMA1)/(pow(PMA_REF,GAMMA1)+pow(MAT50_CL,GAMMA1));
double MAT_CL = FMAT/FMAT_REF;
double FDEC = pow(PMA,GAMMA2)/(pow(PMA,GAMMA2)+pow(DEC50_CL,GAMMA2));
double FDEC_REF = pow(PMA_REF,GAMMA2)/(pow(PMA_REF,GAMMA2)+pow(DEC50_CL,GAMMA2));
double DEC_CL = FDEC/FDEC_REF;
double SCR_REF = exp(-1.22839 + log10(PMA)*0.67190 + 6.27017 *exp(-3.10940*PMA));
double SCR_CL = exp(-FSCR*(SCR - SCR_REF));
double CL = POPCL * pow((V1/POPV1),0.75) * MAT_CL * DEC_CL * SCR_CL * (1+CANCER*CANCER_CL) * exp(PPVCL);
double V1 = POPV1 * WT_NORM * (1-HEEL*HEEL_V1) * exp(PPVV1);
double V2 = POPV2 * WT_NORM * exp(PPVV2);
double Q = POPQ * pow((V2/POPV2),0.75) * (1-HEEL*HEEL_Q);
D_CMT1 = INFDUR;
$ODE
double C1 = CMT1/V1;
double C2 = CMT2/V2;
dxdt_CMT1 = - C1*Q + C2*Q - C1*CL;
dxdt_CMT2 = C1*Q - C2*Q;
dxdt_AUC = C1;
$TABLE
double IPRE = C1;
double DV = IPRE + IPRE*ERRPROP + ERRADD;
$CAPTURE
CL V1 V2 Q PPVCL PPVV1 PPVV2 ERRPROP ERRADD IPRE DV WT AGE GA PMA SCR HEEL CANCER INFDUR
'
# compile the model code
mod <- mcode("PopPKModel",code)
#> Building PopPKModel ...
#> done.
set.seed(123456)
mrgsim_df(mod, tgrid = 0)
#> ID time CMT1 CMT2 AUC CL V1 V2 Q PPVCL PPVV1
#> 1 1 0 0 0 0 0 52.07243 16.82352 1.631636 0.1758639 0.2026356
#> PPVV2 ERRPROP ERRADD IPRE DV WT AGE GA PMA SCR HEEL
#> 1 -0.9072221 -0.1146646 -1.433469 0 -1.433469 70 35 40 35.76923 0.84 0
#> CANCER INFDUR
#> 1 0 1
set.seed(123456)
mrgsim_df(mod, tgrid = 0)
#> ID time CMT1 CMT2 AUC CL V1 V2 Q PPVCL PPVV1
#> 1 1 0 0 0 0 7.332455 52.07243 16.82352 1.631636 0.1758639 0.2026356
#> PPVV2 ERRPROP ERRADD IPRE DV WT AGE GA PMA SCR HEEL
#> 1 -0.9072221 -0.1146646 -1.433469 0 -1.433469 70 35 40 35.76923 0.84 0
#> CANCER INFDUR
#> 1 0 1
Created on 2022-08-31 with reprex v2.0.2
This is my first time using replex so hope this works for you. Thanks
Thanks,
I can reproduce that: if set.seed()
and mrgsim()
are called right after model compiling, a clearance of zero is returned. In every other situations, the right values are returned. I don't know if this has the same source of error as in your first post (where ERRADD was changed, not CL). It is not for every model: my previous reprex example did not show this behaviour.
mod <- mcode("PopPKModel",code)
#> Building PopPKModel ...
#> done.
set.seed(123456)
mrgsim_df(mod, tgrid = 0, Req = "CL, PPVCL, ERRPROP, ERRADD")
#> ID time CL PPVCL ERRPROP ERRADD
#> 1 1 0 0 0.1758639 -0.1146646 -1.433469
set.seed(123456)
mrgsim_df(mod, tgrid = 0, Req = "CL, PPVCL, ERRPROP, ERRADD")
#> ID time CL PPVCL ERRPROP ERRADD
#> 1 1 0 7.332455 0.1758639 -0.1146646 -1.433469
set.seed(123456)
mrgsim_df(mod, tgrid = 0, Req = "CL, PPVCL, ERRPROP, ERRADD")
#> ID time CL PPVCL ERRPROP ERRADD
#> 1 1 0 7.332455 0.1758639 -0.1146646 -1.433469
mod <- mcode("PopPKModel",code)
#> Building PopPKModel ...
#> (waiting) ...
#> done.
set.seed(123456)
mrgsim_df(mod, tgrid = 0, Req = "CL, PPVCL, ERRPROP, ERRADD")
#> ID time CL PPVCL ERRPROP ERRADD
#> 1 1 0 0 0.1758639 -0.1146646 -1.433469
set.seed(123456)
mrgsim_df(mod, tgrid = 0, Req = "CL, PPVCL, ERRPROP, ERRADD")
#> ID time CL PPVCL ERRPROP ERRADD
#> 1 1 0 7.332455 0.1758639 -0.1146646 -1.433469
Created on 2022-08-31 by the reprex package (v2.0.1)
For now, I cannot say what is wrong there.
@wonmy076 -
Can you check these lines?
double CL = POPCL * pow((V1/POPV1),0.75) * MAT_CL * DEC_CL * SCR_CL * (1+CANCER*CANCER_CL) * exp(PPVCL);
double V1 = POPV1 * WT_NORM * (1-HEEL*HEEL_V1) * exp(PPVV1);
If this is what you intended, then you'll have to define V1
before CL
.
@wonmy076 -
Can you check these lines?
double CL = POPCL * pow((V1/POPV1),0.75) * MAT_CL * DEC_CL * SCR_CL * (1+CANCER*CANCER_CL) * exp(PPVCL); double V1 = POPV1 * WT_NORM * (1-HEEL*HEEL_V1) * exp(PPVV1);
If this is what you intended, then you'll have to define
V1
beforeCL
.
Thank you. That has fixed the issue of CL being 0 in the first simulation. But it hasn't fixed the issue of additive error being different in each output after each simulation, and DVs being different.
Ok; could you please update your example?
I'm getting this on 1.0.4
; also reproducible with 1.0.6
(recently uploaded to CRAN).
library(mrgsolve)
#>
#> Attaching package: 'mrgsolve'
#> The following object is masked from 'package:stats':
#>
#> filter
code <- '
$INIT
CMT1 = 0,
CMT2 = 0,
AUC = 0,
$SET
atol = 1e-8, rtol = 1e-8
maxsteps = 100000
$PARAM
POPCL = exp(1.67),
POPV1 = exp(3.75),
POPV2 = exp(3.73),
POPQ = exp(1.17),
MAT50_CL = exp(-0.113),
GAMMA1 = exp(1.06),
DEC50_CL = exp(4.12),
GAMMA2 = -exp(0.806),
FSCR = exp(-0.436),
CANCER_CL = 0.293,
HEEL_V1 = 0.313,
HEEL_Q = 0.595,
// Default Covariate values for Simulation
AGE = 35,
GA = 40,
WT = 70,
SCR = 0.84,
HEEL = 0,
CANCER = 0,
INFDUR = 1,
$OMEGA
@labels PPVCL, PPVV1, PPVV2
0.0749 // PPV on CL
0.0721 // PPV on V1
0.672 // PPV on V2
$SIGMA
@labels ERRPROP, ERRADD
0.046
1.52
$MAIN
double PMA = AGE + GA/52;
double PMA_REF = 35 + (40.0/52.0);
double WT_NORM = WT/70;
double FMAT = pow(PMA,GAMMA1)/(pow(PMA,GAMMA1)+pow(MAT50_CL,GAMMA1));
double FMAT_REF = pow(PMA_REF,GAMMA1)/(pow(PMA_REF,GAMMA1)+pow(MAT50_CL,GAMMA1));
double MAT_CL = FMAT/FMAT_REF;
double FDEC = pow(PMA,GAMMA2)/(pow(PMA,GAMMA2)+pow(DEC50_CL,GAMMA2));
double FDEC_REF = pow(PMA_REF,GAMMA2)/(pow(PMA_REF,GAMMA2)+pow(DEC50_CL,GAMMA2));
double DEC_CL = FDEC/FDEC_REF;
double SCR_REF = exp(-1.22839 + log10(PMA)*0.67190 + 6.27017 *exp(-3.10940*PMA));
double SCR_CL = exp(-FSCR*(SCR - SCR_REF));
double V1 = POPV1 * WT_NORM * (1-HEEL*HEEL_V1) * exp(PPVV1);
double V2 = POPV2 * WT_NORM * exp(PPVV2);
double Q = POPQ * pow((V2/POPV2),0.75) * (1-HEEL*HEEL_Q);
double CL = POPCL * pow((V1/POPV1),0.75) * MAT_CL * DEC_CL * SCR_CL * (1+CANCER*CANCER_CL) * exp(PPVCL);
D_CMT1 = INFDUR;
$ODE
double C1 = CMT1/V1;
double C2 = CMT2/V2;
dxdt_CMT1 = - C1*Q + C2*Q - C1*CL;
dxdt_CMT2 = C1*Q - C2*Q;
dxdt_AUC = C1;
$TABLE
double IPRE = C1;
double DV = IPRE + IPRE*ERRPROP + ERRADD;
$CAPTURE
CL V1 V2 Q PPVCL PPVV1 PPVV2 ERRPROP ERRADD IPRE DV WT AGE GA PMA SCR HEEL CANCER INFDUR
'
mod <- mcode("PopPKModel",code)
#> Building PopPKModel ...
#> done.
set.seed(123456)
a <- mrgsim_df(mod, tgrid = 0)
a
#> ID time CMT1 CMT2 AUC CL V1 V2 Q PPVCL PPVV1
#> 1 1 0 0 0 0 7.332455 52.07243 16.82352 1.631636 0.1758639 0.2026356
#> PPVV2 ERRPROP ERRADD IPRE DV WT AGE GA PMA SCR HEEL
#> 1 -0.9072221 -0.1146646 -1.433469 0 -1.433469 70 35 40 35.76923 0.84 0
#> CANCER INFDUR
#> 1 0 1
set.seed(123456)
b <- mrgsim_df(mod, tgrid = 0)
b
#> ID time CMT1 CMT2 AUC CL V1 V2 Q PPVCL PPVV1
#> 1 1 0 0 0 0 7.332455 52.07243 16.82352 1.631636 0.1758639 0.2026356
#> PPVV2 ERRPROP ERRADD IPRE DV WT AGE GA PMA SCR HEEL
#> 1 -0.9072221 -0.1146646 -1.433469 0 -1.433469 70 35 40 35.76923 0.84 0
#> CANCER INFDUR
#> 1 0 1
a$DV
#> [1] -1.433469
b$DV
#> [1] -1.433469
set.seed(1234567)
c <- mrgsim_df(mod, tgrid = 0)
set.seed(1234567)
d <- mrgsim_df(mod, tgrid = 0)
c$DV
#> [1] 1.451737
d$DV
#> [1] 1.451737
identical(a,b)
#> [1] TRUE
identical(c,d)
#> [1] TRUE
Created on 2022-09-01 by the reprex package (v2.0.1)
@kylebaron
Could it be because my input_df2 for simulation 2 has a different number of rows than input_df1 for simulation 1? This is because there are more doses being given in the input_df2 than df1.
Hi @wonmy076 -
I didn't see input_df2
or input_df1
in the example you gave. That is something we would have checked if it was there. I assumed that the example you gave reproduced the issue you were seeing.
If you want to go ahead and post an example that actually reproduces the issue you are seeing, that would be the next step. But as you noted, the simulation is only reproducible for identical inputs.
Kyle
Here is an example where I get reproducibility (and correctness) issues. When the total number of output time samples (across all patients) is greater than about 32000 (which makes me suspicious that it's actually 32768 = 2**15
), the PROP
errors are unreproducible and do not have the requested variance. When the number of patients or time sampling is adjusted to reduce that total number to below that threshold, I usually get reproducible results. That is, when the number of samples is just under the threshold, most runs will reproduce the same correct-variance results in detail, but sometimes the repeated runs will have higher variances and they will be different each time.
library(mrgsolve)
library(ggplot2)
library(dplyr)
set.seed(12345)
# Try 15 to get the expected, reproducible results back
# Try 16 for the semi-reproducible results
n_patients <- 100
# Try 12 to get the expected, reproducible results back
# Try 13 for the semi-reproducible results
n_days <- 85
# Try 8 to get the expected, reproducible results back
delta <- 1
# Super-simple model
code <- '
$PARAM CL = 1, V = 25
$PKMODEL cmt = "CENT"
$SIGMA @labels PROP
0.025
$TABLE
capture IPRED = CENT/V;
capture DV = IPRED*(1+PROP);
capture PROP = PROP;
'
mod <- mcode("sigma-prop", code)
data <- expand.ev(ID=1:n_patients, amt=50, ii=24, addl=85)
idata <-
data.frame(ID=1:n_patients)
out<-mod %>%
data_set(data) %>%
idata_set(idata) %>%
obsonly() %>%
mrgsim(start=0,end=24*n_days,delta=delta) %>%
as.data.frame()
length(out$PROP)
var(out$PROP)
expected_var <- as.numeric(smat(mod)@data)
gbprop <- out %>% group_by(ID) %>% summarise(varPROP=var(PROP))
g<-ggplot(data=gbprop)+
geom_line(aes(ID, varPROP))+
geom_hline(yintercept=expected_var);g
Hi @rkern -
Is the code you posted producing the expected output? or not producing expected output?
I'm getting this from your code
library(mrgsolve)
#>
#> Attaching package: 'mrgsolve'
#> The following object is masked from 'package:stats':
#>
#> filter
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
set.seed(12345)
# Try 15 to get the expected, reproducible results back
# Try 16 for the semi-reproducible results
n_patients <- 100
# Try 12 to get the expected, reproducible results back
# Try 13 for the semi-reproducible results
n_days <- 85
# Try 8 to get the expected, reproducible results back
delta <- 1
# Super-simple model
code <- '
$PARAM CL = 1, V = 25
$PKMODEL cmt = "CENT"
$SIGMA @labels PROP
0.025
$TABLE
capture IPRED = CENT/V;
capture DV = IPRED*(1+PROP);
capture PROP = PROP;
'
mod <- mcode("sigma-prop", code)
#> Building sigma-prop ...
#> done.
data <- expand.ev(ID=1:n_patients, amt=50, ii=24, addl=85)
idata <-
data.frame(ID=1:n_patients)
set.seed(12345)
out1 <-mod %>%
data_set(data) %>%
idata_set(idata) %>%
obsonly() %>%
mrgsim(start=0,end=24*n_days,delta=delta) %>%
as.data.frame()
set.seed(12345)
out2 <-mod %>%
data_set(data) %>%
idata_set(idata) %>%
obsonly() %>%
mrgsim(start=0,end=24*n_days,delta=delta) %>%
as.data.frame()
identical(out1, out2)
#> [1] TRUE
length(out1$PROP)
#> [1] 204100
var(out1$PROP)
#> [1] 0.02498433
var(out2$PROP)
#> [1] 0.02498433
expected_var <- as.numeric(smat(mod)@data)
MASS::mvrnorm
set.seed(3392)
mass <- MASS::mvrnorm(nrow(out1), 0, as.matrix(smat(mod))) %>% as.numeric()
length(mass)
#> [1] 204100
var(mass)
#> [1] 0.02487356
var(out1$PROP)
#> [1] 0.02498433
smat(mod)@data
#> $...
#> [,1]
#> [1,] 0.025
Created on 2022-10-12 with reprex v2.0.2
My script returns the unexpected results for me (R 4.1.2, mrgsolve 1.0.6).
> var(out$PROP)
[1] 2.12869
Ah, MASS:mvrnorm()
is giving the wrong results. I have MASS 7.3-55
> set.seed(3392)
> mass <- MASS::mvrnorm(nrow(out), 0, as.matrix(smat(mod))) %>% as.numeric()
> length(mass)
[1] 204100
> var(mass)
[1] 2.470504
> var(out$PROP)
[1] 2.12869
> smat(mod)@data
$...
[,1]
[1,] 0.025
Oh wow ... yeah, that's way different. The mrgsolve implementation for generating EPS follows what they do in MASS
and looks like that's consistent with rnorm too on my setup.
Can you send your sessionInfo()
? Can't imagine what is going on here.
Kyle
library(mrgsolve)
#>
#> Attaching package: 'mrgsolve'
#> The following object is masked from 'package:stats':
#>
#> filter
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
set.seed(12345)
# Try 15 to get the expected, reproducible results back
# Try 16 for the semi-reproducible results
n_patients <- 100
# Try 12 to get the expected, reproducible results back
# Try 13 for the semi-reproducible results
n_days <- 85
# Try 8 to get the expected, reproducible results back
delta <- 1
# Super-simple model
code <- '
$PARAM CL = 1, V = 25
$PKMODEL cmt = "CENT"
$SIGMA @labels PROP
0.025
$TABLE
capture IPRED = CENT/V;
capture DV = IPRED*(1+PROP);
capture PROP = PROP;
'
mod <- mcode("sigma-prop", code)
#> Building sigma-prop ...
#> done.
data <- expand.ev(ID=1:n_patients, amt=50, ii=24, addl=85)
idata <-
data.frame(ID=1:n_patients)
set.seed(12345)
out1 <-mod %>%
data_set(data) %>%
idata_set(idata) %>%
obsonly() %>%
mrgsim(start=0,end=24*n_days,delta=delta) %>%
as.data.frame()
set.seed(12345)
out2 <-mod %>%
data_set(data) %>%
idata_set(idata) %>%
obsonly() %>%
mrgsim(start=0,end=24*n_days,delta=delta) %>%
as.data.frame()
identical(out1, out2)
#> [1] TRUE
length(out1$PROP)
#> [1] 204100
var(out1$PROP)
#> [1] 0.02498433
var(out2$PROP)
#> [1] 0.02498433
expected_var <- as.numeric(smat(mod)@data)
MASS::mvrnorm
set.seed(3392)
mass <- MASS::mvrnorm(nrow(out1), 0, as.matrix(smat(mod))) %>% as.numeric()
length(mass)
#> [1] 204100
var(mass)
#> [1] 0.02487356
var(out1$PROP)
#> [1] 0.02498433
var(rnorm(nrow(out1), 0, sqrt(as.numeric(as.matrix(smat(mod))))))
#> [1] 0.02499107
smat(mod)@data
#> $...
#> [,1]
#> [1,] 0.025
sessionInfo()
#> R version 4.1.2 (2021-11-01)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] dplyr_1.0.9 ggplot2_3.3.6 mrgsolve_1.0.6.9000
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.9 pillar_1.8.1 compiler_4.1.2 highr_0.9
#> [5] R.methodsS3_1.8.2 R.utils_2.12.0 tools_4.1.2 digest_0.6.29
#> [9] gtable_0.3.0 evaluate_0.16 lifecycle_1.0.1 tibble_3.1.8
#> [13] R.cache_0.16.0 pkgconfig_2.0.3 rlang_1.0.5 reprex_2.0.2
#> [17] DBI_1.1.3 cli_3.3.0 rstudioapi_0.14 yaml_2.3.5
#> [21] xfun_0.32 fastmap_1.1.0 withr_2.5.0 styler_1.7.0
#> [25] stringr_1.4.1 knitr_1.40 generics_0.1.3 fs_1.5.2
#> [29] vctrs_0.4.1 grid_4.1.2 tidyselect_1.1.2 glue_1.6.2
#> [33] R6_2.5.1 fansi_1.0.3 rmarkdown_2.16 purrr_0.3.4
#> [37] magrittr_2.0.3 MASS_7.3-58.1 scales_1.2.1 htmltools_0.5.3
#> [41] assertthat_0.2.1 colorspace_2.0-3 utf8_1.2.2 stringi_1.7.8
#> [45] munsell_0.5.0 R.oo_1.25.0
Created on 2022-10-12 with reprex v2.0.2
packageVersion("MASS") [1] ‘7.3.58.1’
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Pop!_OS 22.04 LTS
Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] MASS_7.3-55 reprex_2.0.2 dplyr_1.0.10 ggplot2_3.3.6 mrgsolve_1.0.6
loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 pillar_1.8.1 compiler_4.1.2 highr_0.9 R.utils_2.12.0 R.methodsS3_1.8.2 tools_4.1.2
[8] digest_0.6.29 R.cache_0.16.0 evaluate_0.17 lifecycle_1.0.3 tibble_3.1.8 gtable_0.3.1 pkgconfig_2.0.3
[15] rlang_1.0.6 cli_3.4.1 rstudioapi_0.14 yaml_2.3.5 xfun_0.33 fastmap_1.1.0 styler_1.7.0
[22] withr_2.5.0 knitr_1.40 generics_0.1.3 vctrs_0.4.2 fs_1.5.2 grid_4.1.2 getopt_1.20.3
[29] tidyselect_1.2.0 glue_1.6.2 optparse_1.7.3 R6_2.5.1 processx_3.7.0 fansi_1.0.3 rmarkdown_2.17
[36] purrr_0.3.5 callr_3.7.2 farver_2.1.1 clipr_0.8.0 magrittr_2.0.3 ps_1.7.1 scales_1.2.1
[43] htmltools_0.5.3 colorspace_2.0-3 labeling_0.4.2 utf8_1.2.2 munsell_0.5.0 R.oo_1.25.0
> packageVersion("MASS")
[1] ‘7.3.55’
Interestingly, there is a sharp cutoff at 40960
(10 * 2**12) where MASS:rvnorm()
breaks down. 40959
is fine while 40960
is not.
> set.seed(3392)
> mass <- MASS::mvrnorm(40960, 0, as.matrix(0.025)) %>% as.numeric()
> length(mass)
[1] 40960
> var(mass)
[1] 2.430878
> set.seed(3392)
> mass <- MASS::mvrnorm(40959, 0, as.matrix(0.025)) %>% as.numeric()
> length(mass)
[1] 40959
> var(mass)
[1] 0.02495838
Here is my version of RcppArmadillo
, FWIW. That might be the difference between our installations.
> packageVersion("RcppArmadillo")
[1] ‘0.11.4.0.1’
Yeah ... weird. I know MASS::mvrnorm()
doesn't rely on RcppArmadillo at all but still giving weird results.
On my Ubuntu instance, no problems with 40960 either
> library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
> set.seed(3392)
> mass <- MASS::mvrnorm(40960, 0, as.matrix(0.025)) %>% as.numeric()
> length(mass)
[1] 40960
> var(mass)
[1] 0.02495818
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS
Even larger numbers work
> mass <- MASS::mvrnorm(4096000, 0, as.matrix(0.025)) %>% as.numeric()
> length(mass)
[1] 4096000
> var(mass)
[1] 0.02499063
Hmmm, something is going wrong for me in the matrix multiplication in MASS::mvrnorm()
that scales the standard-normal vector. I'm not that familiar with the R ecosystem: would the implementation of %*%
be related at all to your use of Armadillo's matrix multiplication here?
This does seem like something that is related more to my installation, somehow, than a bug in mrgsolve
per se. I'm using the standard R installation from Ubuntu 22.04's apt repository. I'll try the latest release from R's apt repo soon. If anything about my description of the issue makes any lightbulbs go off, I'd be happy to hear it, but that would be above and beyond.
set.seed(12345)
n <- 40960
mu <- 0
Sigma <- as.matrix(0.025)
p <- length(mu)
eS <- eigen(Sigma, symmetric = TRUE)
ev <- eS$values
X <- matrix(rnorm(p * n), n)
scaler <- diag(sqrt(pmax(ev, 0)), p)
scaler
#> [,1]
#> [1,] 0.1581139
temp <- scaler %*% t(X)
var(t(temp))
#> [,1]
#> [1,] 2.40582
Y <- drop(mu) + eS$vectors %*% temp
nm <- names(mu)
if(is.null(nm) && !is.null(dn <- dimnames(Sigma))) nm <- dn[[1L]]
dimnames(Y) <- list(nm, NULL)
Z <- t(Y)
var(Z)
#> [,1]
#> [1,] 233.4444
Created on 2022-10-12 with reprex v2.0.2
would the implementation of %*% be related at all to your use of Armadillo's matrix multiplication here?
It's possible that they both tap the same BLAS routines to do that; that's the only thing that I know of where they could possibly intersect. But I don't know enough of the Armadillo internals to know if they tap BLAS or not. Other than that, it's hard to imagine what those two could have in common; but they must be sharing something for both to be going haywire like this.
Hmm, I seem to have some global installation of the Intel MKL in /lib/x86_64-linux-gnu
for some reason (some non-packaged software decided it needed to place those there, I am guessing). RcppArmadillo.so
, at least, is linking against that as the "system LAPACK" that it found during the build process. Let me see if I can undo that.
❯ ldd ~/R/x86_64-pc-linux-gnu-library/4.1/RcppArmadillo/libs/RcppArmadillo.so | grep mkl
libmkl_rt.so => /lib/x86_64-linux-gnu/libmkl_rt.so (0x00007f8ca6c00000)
Ah, no it's from the official Ubuntu Intel MKL packages. Removing the intel-mkl-full
package (and all of its dependencies) seems to fix both mrgsolve
and MASS:mvrnorm()
.
Thank you very much for your help. I wouldn't have been able to pin it down without the suggestion to look at MASS:mvrnorm()
. I hope this information helps you with future reports about unreproducibility in the EPS
noise.
Oh great. Thanks for reporting and I'm glad we got it figured out. Always an adventure working on different platforms!
Kyle
Hello, I am running consecutive Monte Carlo simulations for the same set of individuals. After each simulation, the ETAs (PPVs) stay the same for each individual, but the additive error changes. As a result, the DVs are different and the IPREDs are the same. This is despite me setting the same seed before each time I run the simulation and in my population code.
This is an example of the outputs I get after simulation 1 and 2.
Simulation 1:
Simulation 2:
Am a bit stumped as to why everything else stays the same but the additive error changes... and despite setting the seed. Any help would be greatly appreciated! Thank you.