jxx123 / simglucose

A Type-1 Diabetes simulator implemented in Python for Reinforcement Learning purpose
MIT License
251 stars 114 forks source link

Risk Index probably incorrect #63

Open drozzy opened 1 year ago

drozzy commented 1 year ago

UPDATE: See my update in a post below - this also has a problem.

This here is simply adding two means: https://github.com/jxx123/simglucose/blob/c51e7f2de9918bed25b75174b8ddb21c4d960923/simglucose/analysis/risk.py#L16

However, adding up two means (or taking the mean) of sub-sequences that are different in length does not equal the mean of the whole sequence: Screenshot_20230824_120745

Instead we have to weigh the each individual mean by the length of sequence it represents. Like this:

weight_rl = len(rl) / horizon
weight_rh = len(rh) / horizon
RI = weight_rl*LBGI + weight_rh*HBGI

While risk index is not explicitly defined in any paper I can find (neither here nor here) - only the low and high risk indicies are defined: Screenshot_20230824_120215

it is not unreasonable to assume that it should be defined such that risk function applied to the whole trajectory would result in the same mean as the combination of low and high risk indicies, e.g.:

Screenshot_20230824_120424

The nice thing about weighing the LBGI and HBGI is that the final risk then is still bounded between 0 and 100.

jxx123 commented 1 year ago

Thanks for raising this issue.

The simple add-up comes from the original UVa/Padova Simulator, but I agree that your proposed weighted average makes more sense, and it will bring the final risk index into the same scale of LBGI and HBGI. If you are interested, feel free to submit a PR to the repo, otherwise, I will do that.

FYI, there is a related risk index computation error reported here https://github.com/jxx123/simglucose/pull/62, which is due to the numpy version compatibility, and there is already a PR to fix this computation error.

drozzy commented 1 year ago

This is a follow up to my previous solution. It still suffers from a problem: the risk values are not capped to [0, 100] as they should be. This stems from the assumption in the paper that the bg values should be within [20, 600] mg/dL. Outside of that range we cannot depend on natural log to behave.

Here is the updated implementation - this should directly replace the contents of risk.py:

import numpy as np

def risk_index(BG, horizon):
    # BG is in mg/dL
    BG_to_compute = BG[-horizon:]
    risks =[risk(r) for r in BG_to_compute]
    LBGI = np.mean([r[0] for r in risks])
    HBGI = np.mean([r[1] for r in risks])
    RI = np.mean([r[2] for r in risks])

    return (LBGI, HBGI, RI)

def risk(BG):
    MIN_BG = 20.0
    MAX_BG = 600.0
    if BG <= MIN_BG: 
        return (100.0, 0.0, 100.0)
    if BG >= MAX_BG:
        return (0.0, 100.0, 100.0)

    U = 1.509 * (np.log(BG)**1.084 - 5.381)

    ri = 10 * U**2

    rl, rh = 0.0, 0.0
    if U <= 0:
        rl = ri
    if U >= 0:
        rh = ri
    return (rl, rh, ri)
drozzy commented 1 year ago

Opened a PR with the proposed changes. Closed the old PR.