jxx123 / simglucose

A Type-1 Diabetes simulator implemented in Python for Reinforcement Learning purpose
MIT License
251 stars 114 forks source link

Error in risk_index #38

Open eseglo opened 3 years ago

eseglo commented 3 years ago

Hi, I think there might be an error in risk_index. In lines 12 and 13 rl = 10 * fBG[fBG < 0]**2 rh = 10 * fBG[fBG > 0]**2 If we pass a list of BG, those lines are removing the elements with fBG(BG)<0 or fBG>0, and afterwards the mean is computed. But if we look at the paper "Clarke, W., & Kovatchev, B. (2009). Statistical tools to analyze continuous glucose monitor data. Diabetes Technology and Therapeutics, 11(SUPPL.1). https://doi.org/10.1089/dia.2008.0138" the definition LBGI and HBGI for a series of n BG reading is the average of the rl and rh for each BG reading, so we use all n readings. With the above implementation the rl=0.0 or rh=0.0 samples are removed from the average. Consider the following example, where we pass 10 samples to risk_index with an horizon of 10

BG | rl | rh 122 | 0.0000 | 0.2277 155 | 0.0000 | 3.5834 144 | 0.0000 | 2.1229 120 | 0.0000 | 0.1441 200 | 0.0000 | 11.6047 350 | 0.0000 | 45.5737 456 | 0.0000 | 69.5785 131 | 0.0000 | 0.8055 70 | 7.7552 | 0.0000 555 | 0.0000 | 90.7512

We get LBGI=7.7552, HBGI=24.93 But, if we take all the 10 samples and compute the average, then we should get LBGI=0.77552, that is, it should divide by 10 not 1 to take all the 10 samples. What do you think?

jxx123 commented 3 years ago

Hi,

Thanks for raising this.

The risk index function has a horizon meaning how many samples in the past you want to average on. For example, at time = 10, horizon = 3, the risk index is based on the average BG at 8, 9, 10. See this code https://github.com/jxx123/simglucose/blob/c51e7f2de9918bed25b75174b8ddb21c4d960923/simglucose/analysis/risk.py#L5.

The risk index I show in the animation is the instantaneous risk, meaning horizon = 1, no average is done there. See here https://github.com/jxx123/simglucose/blob/c51e7f2de9918bed25b75174b8ddb21c4d960923/simglucose/simulation/env.py#L84.

The risk index in the final report is the average of risk per hour, meaning I computed the risk using the last hour BG at each time step, and then I did an average of the risk per hour to come up with the final risk. See details here https://github.com/jxx123/simglucose/blob/c51e7f2de9918bed25b75174b8ddb21c4d960923/simglucose/analysis/report.py#L127.

Let me know if you have further questions. Thanks!

eseglo commented 2 years ago

Hi, thanks for the answer. Sorry to come back to this so late. I am not sure if I understand you.

I mean, in my example, even if we call the risk_index function as risk_index(BG,10), with the vector of BG samples with a horizon=10 for, instance, the average for the LBGI is done just with one sample, because it removes all the samples whose rl is 0, which in that case (see above) is just all except for one. If you try and call it with risk_index(BG,3) with the same BG vector you can see that the LBGI result is the same, because it only takes the sample for which rl!=0.

My point is that, in that particular example with a BG of 10 samples, if you call risk_index(BG,10) the correct result should be LBGI=7.7552/10 and with risk_index(BG,3) it should be LBGI=7.7552/3. In other words, the average in that function should be done by dividing by horizon, not by the number of samples with rl!=0.