jxx123 / simglucose

A Type-1 Diabetes simulator implemented in Python for Reinforcement Learning purpose
MIT License
240 stars 113 forks source link

Environment creates the same scenario between resets #10

Closed sumanabasu closed 3 years ago

sumanabasu commented 3 years ago

Consider this minimal example:

import gym
from gym.envs.registration import register

register(
    id='simglucose-adult001-v0',
        entry_point='simglucose.envs:T1DSimEnv',
        kwargs={'patient_name': 'adult#001'}
    )
env = gym.make('simglucose-adult001-v0')
env.seed(0)

env.reset()
>>> Observation(CGM=126.63092222821054)

env.reset()
>>> Observation(CGM=126.63092222821054)

The environment produces the same scenario Observation(CGM=126.63092222821054) on reset(), even on changing the seed unless I make the environment again. This shouldn't be the case as it keeps on generating the same episode.

Is this a potential bug, or am I missing something here? @jxx123

jxx123 commented 3 years ago

Thanks for bringing up the issue. If a seed is set, I don't think reset should reset the seed as well. For example, in this basic environment, reset won't reset the seed. So the example you show here is expected.

But as you said, the environment generates the same episode after changing the seed. This is definitely a bug. I have a temporary fix in my local branch. Will submit the fix soon. Thanks!

jxx123 commented 3 years ago

The fix is submitted. Now the results will be like this:

import gym
from gym.envs.registration import register

register(
    id='simglucose-adult001-v0',
        entry_point='simglucose.envs:T1DSimEnv',
        kwargs={'patient_name': 'adult#001'}
        )
env = gym.make('simglucose-adult001-v0')

env.seed(0)
print(env.reset())  # >>> Observation(CGM=157.25395877711873)
print(env.reset())  # >>> Observation(CGM=157.25395877711873)

env.seed(1)
print(env.reset())  # >>> Observation(CGM=134.48866253152116)
print(env.reset())  # >>> Observation(CGM=134.48866253152116)

Note that reset() won't change the random state, if you want to change the random state, use seed(another_seed) instead.

eseglo commented 3 years ago

Hi, I am not sure about the fix. Can we discuss it? In env.py.reset() everything is reset, that is, patient, sensor, pump and scenario. For patient, it is reset to her initial state. I am not familiar with the UVA/Padova model but I guess this may be a need of the model. In pump there is no random state. But for sensor and scenario, in their reset()methods, the random state set again with the initial seed. So it is generating exactly the same episode and these are the only sources of randomness. You can see below a snippet from a log: create_scenario {'meal': {'time': [363.0, 581.0, 803.0, 1158.0, 1340.0], 'amount': [35, 3, 71, 94, 7]}} _reset BG 149.01999999999998 _reset CGM 159.63251057672602 reset CGM 161.28268202215403 Called reset Resetting CGM sensor ... create_scenario {'meal': {'time': [363.0, 581.0, 803.0, 1158.0, 1340.0], 'amount': [35, 3, 71, 94, 7]}} _reset BG 149.01999999999998 _reset CGM 159.63251057672602 reset CGM 161.28268202215403

Is this the intended behavior? If it actually is, could you explain why? As I read from the gym documentation, the episode should be independent:

Thanks a lot @jxx123

 def reset(self):
        """Resets the environment to an initial state and returns an initial
        observation.
        Note that this function should not reset the environment's random
        number generator(s); random variables in the environment's state should
        be sampled independently between multiple calls to `reset()`. In other
        words, each call of `reset()` should yield an environment suitable for
        a new episode, independent of previous episodes.
        Returns:
            observation (object): the initial observation.
        """
jxx123 commented 3 years ago

@eseglo Thanks for the latest gym document (somehow my old version gym does not document this). I see the problem now. I misunderstood it before. So the requirement is that reset() won't change the seed or random state, but the episode should be different from the previous one after calling reset().

Cool, let me reopen this issue. Will work on a fix soon. Thanks!

sumanabasu commented 3 years ago

@jxx123 Thank you for trying to fix this! I was using the workaround presented below. Basically, I was changing the seed to the sensor and the scenario every time after I call the reset(). I thought this generates new episodes as:

env = gym.make('simglucose-adult001-v0')

env.reset()
env.env.sensor.seed = seed
env.env.scenario.seed = seed

So, if we set these two seeds inside the old _seed() function that you had, it looks about right to me. Although, we might want to move this line to the _init_ if we do not want to change the self.np_random between resets. If I understand @eseglo 's comment, it also conforms to the gym guidelines, unless I'm missing some other random number generator.

I was wondering how else do you plan to generate new episodes without changing these 2 seeds. In my understanding, they are the only sources of randomness in the model outside the random number generators. Could you please explain? It will help expedite my research without having to wait for the fix.

On a different note, for research purposes, I am curious about how you've generated the patient information for this simulator. Is it the same as the UVA Padova in silico patient data? I thought that's private information given UVA Padova is a paid simulator. I'm trying to understand the benefits of having the full UVA Padova matlab simulator and which data sources of this simulator I need to change if I get hold of the UVA Padova.

Thanks again for your time, effort, and enthusiasm! I might also want to help you build the documentation for the repo since I've been spending quite some time playing with it.

jxx123 commented 3 years ago

@sumanabasu Thank you so much for your comments!

jxx123 commented 3 years ago

The fix is in. Will make the initial BG settable in another PR.