ronkeizer / vpc

R library to create visual predictive checks (VPC)

Problem when I take less time observations #33

Closed Laura3338 closed 7 years ago

Laura3338 commented 8 years ago

Hello, thanks for your work on this package, it helps me a lot. I am a beginner on this subject, so my question might be obvious. I am using an ODE model, but I am not using NONMEM: I simulate my data in R and estimate the parameters of my model with NIMROD. I have a dataset of 128 patients who have different observation times. I simulated my dataset 2000 times by drawing the parameters from their posterior distribution (estimated with NIMROD), and for each patient I use the same observation times as in the original data (if patient 1 has observations at times 1, 7, 10 and 11, so do the 2000 simulated copies of patient 1). My problem is that some patients have observations until time 800 and others only until time 150. So the end of the VPC (after 200) is less "accurate", since the patients who continued were selected based on some physiological values. I therefore decided to keep the observations for all patients only up to time 200. To do that, I removed the simulated and observed data with time > 200.

My issue is that when I make one VPC with all the data and one with only the data with time < 200, even the beginning of the two plots looks totally different, although the data (simulated and observed) in that range are exactly the same. Is this a problem, or is it that I don't understand how the VPC works? Also, with the restricted data the time axis of the VPC begins at -100, even though the time in my data begins at 0.

Thanks in advance for your help, and sorry if my English is not perfect as it is not my native language.

Laura

Attachments: vpc_400, vpc_all, obs_all.txt

ronkeizer commented 8 years ago

Hi Laura, I think the reason the VPC plot looks slightly different between <200 and <800 is that the binning is different. I think in both cases you used around 10 bins, but in each case they are distributed over the complete time span, so the bins in the <200 plot will be much tighter. You can, however, pre-specify the bin separators yourself as an argument (e.g. bins = c(0, 2, 4, 6, 8, 10, 16, 25)). In that case the plots for <200 and <800 should be the same. Regarding the x-axis going below zero: I noticed that in your observed data there are some rows at the end with TIME = -99, which is probably the reason. br, Ron
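
To make the binning point concrete, here is a minimal base R sketch. It uses made-up observation times, not the data from this thread, and `auto_breaks` is a hypothetical stand-in for automatic binning (equal-width bins over the observed range); the binning methods in vpc itself may work differently. It only illustrates that automatic breaks depend on the time span of the data, while a fixed break vector assigns the early observations to the same bins whether or not later data are present.

```r
# Made-up observation times: most patients stop early, a few run to 800.
set.seed(1)
time_all <- sort(c(runif(300, 0, 200), runif(100, 200, 800)))
time_200 <- time_all[time_all <= 200]

# Hypothetical helper (not part of vpc): n equal-width bins over the range of x.
auto_breaks <- function(x, n = 10) seq(min(x), max(x), length.out = n + 1)

breaks_all <- auto_breaks(time_all)  # 10 bins spread over the full time span
breaks_200 <- auto_breaks(time_200)  # 10 bins spread over the restricted span

# Bin index of each early observation under the two automatic schemes.
auto_idx_all <- findInterval(time_200, breaks_all, rightmost.closed = TRUE)
auto_idx_200 <- findInterval(time_200, breaks_200, rightmost.closed = TRUE)

# The early data occupy far fewer bins when the breaks span the full data.
n_bins_all <- length(unique(auto_idx_all))
n_bins_200 <- length(unique(auto_idx_200))

# Fixed breaks: identical up to 200, with extra bins added for the later data.
fixed_200 <- c(0, 25, 50, 75, 100, 150, 200)
fixed_all <- c(fixed_200, 400, 800)

idx_all <- findInterval(time_all[time_all <= 200], fixed_all, rightmost.closed = TRUE)
idx_200 <- findInterval(time_200, fixed_200, rightmost.closed = TRUE)

# With fixed breaks the early observations land in the same bins either way.
same_bins <- identical(idx_all, idx_200)
```

A break vector such as `fixed_200` is the kind of value that could be passed as the `bins` argument mentioned above; the specific break points here are arbitrary and would need to be chosen to suit the real sampling times.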

Laura3338 commented 8 years ago

Hello, thank you so much, that explains all my issues! As for TIME = -99, I completely forgot about that: it's how I tell my program (NIMROD) that my file is over. I removed it when I first did the analysis, but then I changed a few things and didn't think of it. Thanks again for your quick help and for your package,

Laura
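
For anyone hitting the same negative x-axis, a small base R sketch of dropping end-of-file marker rows before plotting. The data frame is invented for illustration, and the column names `ID`, `TIME` and `DV` are assumptions; only the `TIME = -99` marker comes from this thread.

```r
# Invented observed data ending with a marker row (TIME = -99).
obs <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 99),
  TIME = c(1, 7, 10, 0, 150, -99),
  DV   = c(2.1, 3.4, 3.0, 1.8, 2.6, NA)
)

# Keep only real observations: times at or after zero.
obs_clean <- obs[obs$TIME >= 0, ]

# The time range now starts at the first real observation.
time_range <- range(obs_clean$TIME)
```

Filtering on `TIME >= 0` assumes that no genuine observation has a negative time; if pre-dose samples are recorded that way, matching the marker value exactly (`obs$TIME != -99`) would be the safer choice.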