jdshaolinstar opened this issue 3 years ago
> 3 different environments, each with its own dataset...
If you have datasets, you can probably compute the statistics and normalize in advance, so you should not need VecNormalize, no?
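Computing the statistics over a fixed dataset is straightforward; a minimal pure-Python sketch of that pre-normalization (names like `dataset_stats` are hypothetical, not stable-baselines API):

```python
import math

def dataset_stats(observations):
    """Compute per-feature mean and std over a fixed dataset.

    `observations` is a list of equal-length feature vectors
    (a hypothetical stand-in for a real dataset)."""
    n = len(observations)
    dim = len(observations[0])
    means = [sum(obs[i] for obs in observations) / n for i in range(dim)]
    stds = [
        math.sqrt(sum((obs[i] - means[i]) ** 2 for obs in observations) / n)
        for i in range(dim)
    ]
    return means, stds

def normalize(obs, means, stds, eps=1e-8):
    """Standardize one observation with the precomputed statistics."""
    return [(x - m) / (s + eps) for x, m, s in zip(obs, means, stds)]

# Two features on very different scales, normalized to zero mean / unit std:
data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
means, stds = dataset_stats(data)
normalized = [normalize(obs, means, stds) for obs in data]
```

With the statistics saved alongside the dataset, the same `normalize` call can be applied at both training and inference time.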
Hmm... that's actually a great idea. Before using Stable Baselines, I was normalizing the data myself. VecNormalize just seems like an intelligent wrapper that does a lot of useful things. The reason I don't normalize the data in advance is that I want the ability to receive live (unknown-scale) data in a production setting, not just data from an existing dataset. So I guess I could normalize the data before feeding it into the env, but I feel I would just be reinventing what VecNormalize probably already does well. Also, I believe VecNormalize also takes care of reward normalization? I'm using it alongside the MlpLnLstmPolicy (layer-normalized LSTM) policy.
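For live, unknown-scale data the statistics can be maintained online rather than precomputed, which is roughly the mechanism VecNormalize uses internally (a running mean/variance of observations, and analogously a running variance of discounted returns for reward normalization). A hedged pure-Python sketch of the observation half, using Welford's online algorithm (the class name is made up for illustration):

```python
class RunningNorm:
    """Online mean/variance via Welford's algorithm -- roughly the
    mechanism behind VecNormalize's observation normalization."""

    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations
        self.eps = eps

    def update(self, x):
        """Fold one new live value into the running statistics."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        """Standardize a value using the statistics seen so far."""
        var = self.m2 / self.count if self.count > 1 else 1.0
        return (x - self.mean) / (var ** 0.5 + self.eps)

norm = RunningNorm()
for value in [10.0, 20.0, 30.0, 40.0]:  # stand-in for a live stream
    norm.update(value)
```

The point of the online form is exactly the production concern above: no pass over a dataset is needed, and the scale adapts as new data arrives.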
@araffin I'm assuming that VecNormalize applies its statistics to the inputs the environment produces and has nothing to do with what's saved in the model, so loading different normalization statistics between environments probably doesn't matter too much (I hope). So perhaps that question is solved.

I'm not sure if this should be a different topic, but in regards to multiprocessing: I've read that with recurrent networks, the model has to be tested with the same number of envs it was trained on. If I spawn 5 separate Python processes and load the same model, could I save them each to the same model file after their individual episodes? Or is there some shared logic between steps that actually needs to happen? Asking the same question a different way: when I save a training run to a model file, are the weights being intelligently blended together, regardless of whether I'm training the same model in 5 different processes, or will the newer process overwrite the progress of the concurrent model saves?
> when I save a training to a model file, are the weights being intelligently blended together, regardless of if I'm training the same model in 5 different processes, or will the newer process overwrite the progress of the concurrent model saves?
When using `VecEnv`, only the env computation is separated between processes; the model and gradient updates are done in the same process.
When using MPI (with PPO1 for instance), there is synchronization being done after each gradient step.
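The division of labor described above can be sketched without the library: only the env stepping fans out to workers, while every weight update happens in one place, so there is never a second copy of the model to "blend" or overwrite. A toy illustration (threads standing in for SubprocVecEnv's worker processes; all names hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def env_step(seed):
    """Stand-in for one environment's step(); in SubprocVecEnv this
    would run in a separate worker process."""
    return float(seed)  # pretend this is a reward

weights = [0.0]  # the single model copy, owned by the main process

with ThreadPoolExecutor(max_workers=4) as pool:
    for update in range(3):
        # env computation fans out across the workers...
        results = list(pool.map(env_step, range(4)))
        # ...but the gradient update happens only here, in the main process
        weights[0] += sum(results) / len(results)
```

This is why concurrently training 5 independent processes that each save to the same file would overwrite rather than merge progress: merging only happens when a single learner (or an explicit synchronization layer like MPI) owns the update.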
Hello. I've read the docs on how to use VecNormalize for an environment, but I feel it is unclear how to approach VecNormalize when training an agent across multiple environments.
Given that I want to save and load a model trained on 3 different environments, each with its own dataset, which option makes more sense?

a.) Save and load 3 different VecNormalize statistics files, one per environment:

```python
env.save(env_1_vec_file) ... env = VecNormalize.load(env_1_vec_file, env1)
env.save(env_2_vec_file) ... env = VecNormalize.load(env_2_vec_file, env2)
env.save(env_3_vec_file) ... env = VecNormalize.load(env_3_vec_file, env3)
```

b.) Save and load all envs to a single shared VecNormalize file:

```python
env.save(shared_vec_file) ... env = VecNormalize.load(shared_vec_file, env1)
env.save(shared_vec_file) ... env = VecNormalize.load(shared_vec_file, env2)
env.save(shared_vec_file) ... env = VecNormalize.load(shared_vec_file, env3)
```
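The practical difference between the two options is only bookkeeping: option (a) keeps one statistics file per dataset, option (b) keeps one file that the last saver wins. A minimal stdlib sketch of option (a)'s bookkeeping, with `save_stats`/`load_stats` as hypothetical stand-ins for VecNormalize's `env.save(path)` / `VecNormalize.load(path, env)`:

```python
import json
import os
import tempfile

def save_stats(path, mean, var):
    """Stand-in for env.save(path) on a VecNormalize wrapper."""
    with open(path, "w") as f:
        json.dump({"mean": mean, "var": var}, f)

def load_stats(path):
    """Stand-in for VecNormalize.load(path, env)."""
    with open(path) as f:
        return json.load(f)

tmp = tempfile.mkdtemp()

# Option (a): one statistics file per dataset/environment,
# so each env is always paired with the statistics it was trained under.
for i, (mean, var) in enumerate([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)], start=1):
    save_stats(os.path.join(tmp, f"env_{i}_vec_file.json"), mean, var)

stats_2 = load_stats(os.path.join(tmp, "env_2_vec_file.json"))
```

Under option (b), each `save_stats` call would target the same path, so environment 3's statistics would silently replace the others.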