ActivitySim / activitysim

An Open Platform for Activity-Based Travel Modeling
https://activitysim.github.io
BSD 3-Clause "New" or "Revised" License

ActivitySim Random Seed/Stochasticity Issue #576

Closed siweih1997 closed 2 months ago

siweih1997 commented 2 years ago

Hi ActivitySim technical support,

We are from the University of California, Irvine, working with SANDAG. We ran ActivitySim several times with different random seeds to see whether it would produce the same results.

We found that the set_base_seed() function in random.py is related to the seeds and randomness.

def set_base_seed(self, seed=None):
    """
    Like seed for numpy.random.RandomState, but generalized for use with all random streams.

    Provide a base seed that will be added to the seeds of all random streams.
    The default base seed value is 0, so set_base_seed(0) is a NOP

    set_base_seed(1) will (e.g.) provide a different set of random streams than the default
    but will provide repeatable results re-running or resuming the simulation

    set_base_seed(None) will set the base seed to a random and unpredictable integer and so
    provides "fully pseudo random" non-repeatable streams with different results every time

    Must be called before first step (before any channels are added or rands are consumed)

    Parameters
    ----------
    seed : int or None
    """

    if self.step_name is not None or self.channels:
        raise RuntimeError("Can only call set_base_seed before the first step.")

    assert len(list(self.channels.keys())) == 0

    if seed is None:
        self.base_seed = np.random.RandomState().randint(_MAX_SEED)
        logger.debug("Set random seed randomly to %s" % self.base_seed)
    else:
        logger.debug("Set random seed base to %s" % seed)
        self.base_seed = seed
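
The three cases in the docstring above can be illustrated with a small standalone sketch. It is not ActivitySim code: the helper draws_for_stream and its stream_seed argument are hypothetical, the value of _MAX_SEED is an assumption, and Python's standard random module is used in place of numpy.random.RandomState so the example has no dependencies. It only shows the idea of adding a base seed to a per-stream seed.

```python
import random

_MAX_SEED = 1 << 31  # assumed bound for this sketch, not ActivitySim's value


def draws_for_stream(base_seed, stream_seed, n=3):
    """Return n draws from a stream seeded with base_seed + stream_seed.

    base_seed=None picks an unpredictable base seed, mirroring the
    "non-repeatable" case described in the docstring.
    """
    if base_seed is None:
        base_seed = random.SystemRandom().randrange(_MAX_SEED)
    rng = random.Random((base_seed + stream_seed) % _MAX_SEED)
    return [rng.random() for _ in range(n)]


# Same base seed: repeatable draws.
same_a = draws_for_stream(0, stream_seed=7)
same_b = draws_for_stream(0, stream_seed=7)

# Different base seed: a different but still repeatable stream.
other = draws_for_stream(1, stream_seed=7)

# No base seed: draws change from run to run.
unseeded = draws_for_stream(None, stream_seed=7)
```

Under these assumptions, same_a equals same_b, while other differs from both, which is the behaviour one would expect from runs with rng_base_seed 0 versus 1.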

We also found that the set_base_seed() function is called in three different places.

So we changed the default value passed to set_base_seed() from 0 to None on line 509 of the pipeline.py file, from:

    get_rn_generator().set_base_seed(inject.get_injectable('rng_base_seed', 0))

to:

    get_rn_generator().set_base_seed(inject.get_injectable('rng_base_seed', None))

We expected to see some randomness, but, very surprisingly, the results were identical across all 5 runs.

Can you help us understand what's going wrong?

JoeJimFlood commented 2 years ago

The place to edit the random seed is in Line 64 of core/config.py (Should this be configurable in the settings file?). However, setting that value to None results in an error due to the data type of the random seed. I just submitted a pull request to fix that.
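
A possible explanation for why the pipeline.py edit had no effect, sketched with a toy registry rather than ActivitySim's actual inject module: if the second argument of get_injectable() is only a fallback for names that are not registered, then changing it does nothing once config.py registers rng_base_seed. The names _injectables and injectable below are hypothetical stand-ins.

```python
# Toy registry, assumed to behave like a "lookup with fallback".
_injectables = {}


def injectable(func):
    """Register func under its own name (hypothetical decorator)."""
    _injectables[func.__name__] = func
    return func


def get_injectable(name, default=None):
    """Return the registered value; use default only if name is unregistered."""
    if name in _injectables:
        return _injectables[name]()
    return default


@injectable
def rng_base_seed():
    return 0  # hard-coded, as in the config.py snippet quoted in this thread


# The fallback is never consulted because the name is registered.
seed_with_default_zero = get_injectable("rng_base_seed", 0)
seed_with_default_none = get_injectable("rng_base_seed", None)

# Only an unregistered name falls through to the default.
unregistered = get_injectable("not_registered", None)
```

In this sketch both lookups return 0, so the seed can only change by editing the registered function itself.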

jfdman commented 2 years ago

Hi Joe, yes, it should be configurable by adding "rng_base_seed: x", where x is an integer, to the settings.yaml file. However, @fxie-mwcog confirms that this has no effect on results. Can you please work with @jpn-- to get this fix into the next release (if it hasn't been addressed yet)? A number of folks want to investigate simulation variance. Thanks!!

jpn-- commented 2 years ago

@fxie-mwcog, did you find no randomness when changing the fixed seed to a different fixed seed (this would be a different problem than described above, which we also should fix), or just when changing to "None" which should fall back to non-reproducible randomness but doesn't (same problem as above)?

xiex0055 commented 2 years ago

It's the former. We did not test 'rng_base_seed: None'. We tested 'rng_base_seed: 0' and 'rng_base_seed: 1', but both generated the same modeling results as the original model run (which does not have any 'rng_base_seed' entry in the settings file).

JoeJimFlood commented 2 years ago

@jfdman @jpn-- @xiex0055 I looked through the source code and didn't find anywhere that reads the rng_base_seed value from the settings file (unless there's a way other than config.setting() that I'm missing). I only found where it's set as an injectable, and as it stands it appears to always be 0. I tried creating an environment in which I changed config.py from:

@inject.injectable()
def rng_base_seed():
    return 0

to:

@inject.injectable()
def rng_base_seed():
    try:
        return setting("rng_base_seed")
    except KeyError:
        return 0

I then ran a few tests of prototype_mtc: two with rng_base_seed set to 0, two with it set to 1, one with it set to None (which I learned today is done by typing rng_base_seed: null in the file), and two without any explicit definition in the settings file. I then got the number of tours by mode for each run (results attached). The runs with the seed set to 0 matched, as did the runs set to 1, and they were different to each other. The run where it was set to None was different. However, I was expecting the runs where the seed was undefined to be the same as when it was set to 0. They were both different from the rest, leading me to believe that when something isn't in the settings file a KeyError is not raised and the value is just set to None. Is this understanding correct?

Should I update my pull request with this change? Or without the try/except statements?
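
If the hypothesis above is right, the lookup behaves like dict.get(), which returns None for a missing key instead of raising KeyError, so the except branch never runs. A minimal sketch of one way to tell "not in the file" apart from an explicit "rng_base_seed: null", assuming the settings are available as a plain dict (resolve_base_seed and _MISSING are hypothetical names, not part of ActivitySim):

```python
_MISSING = object()  # sentinel: distinguishes "absent" from an explicit null


def resolve_base_seed(settings, default=0):
    """Return the configured base seed.

    - key absent          -> default (repeatable runs)
    - key present as None -> None (explicit request for a random base seed)
    - key present as int  -> that integer
    """
    value = settings.get("rng_base_seed", _MISSING)
    if value is _MISSING:
        return default
    return value


# A plain .get() cannot tell the first two cases apart:
plain_lookup = {}.get("rng_base_seed")

absent = resolve_base_seed({})
explicit_null = resolve_base_seed({"rng_base_seed": None})
explicit_one = resolve_base_seed({"rng_base_seed": 1})
```

With this approach, runs without any rng_base_seed entry would match runs with the seed set to 0, while an explicit null would still produce non-repeatable results.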

jfdman commented 1 year ago

@jpn-- I think this issue is fixed with the latest release? If so, please close.

xiex0055 commented 2 months ago

@jfdman MWCOG staff recently conducted test model runs with different random seed specifications and obtained slightly different results. We can confirm that this has been fixed in the latest ActivitySim release. As you suggested, @jpn-- may close this issue.