Currently some tasks use torch randomness for loading assets. This is problematic since the assets loaded into the environment depend on both a fixed seed and the (variable) number of environments. Moreover, CPU and GPU torch randomness do not generate the same results when given the same seed.
Two ways to fix:
Give users a list of seeds, one per environment, for asset loading. This way, regardless of the number of environments, each individual env is built with its given seed (and you can flexibly change the order of environments and still get the same assets loaded). This would also require strongly encouraging users to adopt this per-env seed setup and to call np.random.seed() very frequently themselves.
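A minimal sketch of the first option, showing why per-env seeds make asset loading invariant to the total number of environments (`sample_asset` is a hypothetical per-env asset sampler, not an existing API):

```python
import numpy as np

def sample_asset(seed, num_assets=10):
    # Hypothetical per-env asset choice: the result depends only on this
    # env's own seed, never on how many other envs exist.
    rng = np.random.default_rng(seed)
    return int(rng.integers(num_assets))

seeds = [0, 1, 2, 3]
assets_4 = [sample_asset(s) for s in seeds]
# Building only 2 envs with the first two seeds loads the same assets
# into envs 0 and 1 as the 4-env run did.
assets_2 = [sample_asset(s) for s in seeds[:2]]
assert assets_2 == assets_4[:2]
```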
Give users a class that wraps numpy random and generates random numbers in batches as normal, but internally uses a different seed for each entry of the generated batch. This class maintains a list of already-created numpy RNG objects to do this. It is somewhat slower, but that is fine since this happens during reconfiguration (rarely done).
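A sketch of what such a wrapper could look like; the class and method names here are assumptions for illustration, not an existing API:

```python
import numpy as np

class BatchedRNG:
    """Wraps numpy random so each entry of a batched draw comes from its
    own seeded RNG. The RNG objects are created once and reused, so
    repeated calls continue each env's independent random stream."""

    def __init__(self, per_env_seeds):
        # one persistent numpy Generator per batch entry
        self._rngs = [np.random.default_rng(s) for s in per_env_seeds]

    def uniform(self, low=0.0, high=1.0, size=()):
        # draw one sample of shape `size` per env, stacked along axis 0
        return np.stack([rng.uniform(low, high, size=size) for rng in self._rngs])
```

With this, a batch built from seeds `[0, 1]` produces the same first two rows as a batch built from seeds `[0, 1, 2, 3]`, which is exactly the invariance we want.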
Regardless, a list of seeds is required in some form. This will also update the reset function's options argument to accept a per_env_seeds key to directly determine each env's seed, instead of auto-generating the list of env seeds from a single seed.
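A sketch of how reset could resolve the seed list; the per_env_seeds key comes from the text above, while the helper name and exact signature are assumptions:

```python
import numpy as np

def resolve_env_seeds(num_envs, seed=None, options=None):
    # If the user supplies per_env_seeds in reset's options, use it
    # directly; otherwise auto-generate one seed per env from `seed`.
    options = options or {}
    if "per_env_seeds" in options:
        per_env_seeds = [int(s) for s in options["per_env_seeds"]]
        assert len(per_env_seeds) == num_envs, "need one seed per env"
    else:
        root = np.random.default_rng(seed)
        per_env_seeds = [int(s) for s in root.integers(2**31, size=num_envs)]
    return per_env_seeds
```

Either path yields a deterministic per-env seed list, so reconfiguration reproduces the same assets for the same inputs.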