On it 😄!

Here's what I've done, but I'm not sure of the result 🤔:

In `package_to_hub`:
```python
from pathlib import Path

from stable_baselines3.common.vec_env import VecNormalize

# If we are normalizing input features, get and save the VecNormalize statistics
vecnorm = model.get_vec_normalize_env()
if vecnorm is not None:
    # Save the VecNormalize statistics to the repo
    vecnorm_path = Path(repo_local_path) / "vec_normalize.pkl"
    vecnorm.save(vecnorm_path)
    # Load the VecNormalize statistics into the evaluation env
    eval_env = VecNormalize.load(vecnorm_path, eval_env)
    # Do not update VecNormalize stats at test time
    eval_env.training = False
    # Reward normalization is not needed at test time
    eval_env.norm_reward = False
```
We retrieve the vecnorm from the model using the `get_vec_normalize_env()` helper, since we pass the model as a parameter to `package_to_hub`.
https://github.com/huggingface/huggingface_sb3/blob/feature-vecnorm/huggingface_sb3/push_to_hub.py
I ran some tests here: https://colab.research.google.com/drive/1qnWiwBKt5EYsPB3wyQRWJ_TT-P9vKdYJ?usp=sharing

Here's what it looks like: https://huggingface.co/ThomasSimonini/TEST15ppo-Walker2DBulletEnv-v0
> vecnorm = model.get_vec_normalize_env()

You should use `unwrap_vec_normalize` instead, because you are usually using an eval env that is not connected to the model (for instance when you load the model). Otherwise the rest is correct ;)
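To illustrate the difference, a minimal sketch (with `model` and `eval_env` taken from the surrounding context):

```python
from stable_baselines3.common.vec_env import unwrap_vec_normalize

# Only works if the model still has its training env attached:
vecnorm = model.get_vec_normalize_env()

# Works on any vec env, e.g. an eval env built independently of the model;
# returns the VecNormalize wrapper if present, otherwise None:
vecnorm = unwrap_vec_normalize(eval_env)
```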
So in this case, it implies that I need to add an `env` parameter to `package_to_hub`. I can't use `eval_env` for that (`vecnorm = unwrap_vec_normalize(eval_env)`), given we didn't train the VecNormalize stats with it 🤔
That's already the case, no?

> (vecnorm = unwrap_vec_normalize(eval_env)) given we didn't train the VecNormalize stats with it

`unwrap_vec_normalize` gives you `vecnorm=None` if there is no `VecNormalize` wrapper; you mostly need to make sure `eval_env` is a vec env (if not, you know that there is no normalization anyway).
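As a rough sketch of that detection (with `eval_env` being whatever the caller passes in):

```python
from stable_baselines3.common.vec_env import VecEnv, unwrap_vec_normalize

if isinstance(eval_env, VecEnv):
    # None if no VecNormalize wrapper is found in the wrapper chain
    vecnorm = unwrap_vec_normalize(eval_env)
else:
    # Not a vec env, so there cannot be any VecNormalize wrapper
    vecnorm = None
```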
Oh, looking at your implementation, I think we are thinking differently. I just wanted to save the stats if needed and keep the rest the same (assuming that the eval env is already properly wrapped).
Okay, just to be sure:

We have `env`, which is a vec env. When we train our model with `model.learn()`, we also normalize the environment (input features and reward).

Now I need to get this `vec_normalize.pkl` (from `env`), because if I get the one from `eval_env`, the stats are useless since they were not updated during training, right?

In `package_to_hub` I need to evaluate the agent; to do that, I need to get the `eval_env` (which I assume is properly wrapped) and load the `vec_normalize.pkl` into it (except I don't normalize rewards this time).
So my idea was:
What process were you thinking about? Because I'm not sure mine is the best idea 🤔
> We have `env`, which is a vec env. When we train our model with `model.learn()`, we also normalize the environment (input features and reward).

Yes, unless you load a model (for instance the best model saved by the eval callback).
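In that loaded-model case, a small sketch of the implication (the file name is illustrative):

```python
from stable_baselines3 import PPO

# A model loaded from disk has no training env attached, so
# get_vec_normalize_env() returns None even if VecNormalize was used
# during training; the stats must be restored from vec_normalize.pkl instead.
model = PPO.load("best_model.zip")
vecnorm = model.get_vec_normalize_env()  # None here
```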
> because if I get the one from `eval_env`, the stats are useless since they were not updated during training, right?

It depends on how you define the eval env, but normally you should already be using the stats from the training env. In the eval callback, we actually synchronize the two.
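For reference, a simplified sketch of that synchronization, using the helper that SB3's `EvalCallback` relies on (`train_env` and `eval_env` are assumed to be VecNormalize-wrapped envs from the surrounding training code):

```python
from stable_baselines3.common.vec_env import sync_envs_normalization

# Copies the running observation/reward statistics from the training
# env's VecNormalize wrapper into the eval env's wrapper, so both
# normalize the same way before an evaluation run.
sync_envs_normalization(train_env, eval_env)
```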
> (which I assume is properly wrapped)

Properly wrapped for me also means that it has the correct stats, so you could discard `model.env` completely.
> So my idea was:

This would work only if you do the packaging right after training, and it requires that the model has an env.
My idea was simpler: automatically detect and save the normalization when present (using, for instance, the `unwrap_vec_normalize()` helper).
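Putting that together, a minimal sketch of the detect-and-save step (assuming `eval_env` and `repo_local_path` come from the surrounding `package_to_hub` context):

```python
from pathlib import Path

from stable_baselines3.common.vec_env import VecEnv, unwrap_vec_normalize

# Save the normalization statistics only when a VecNormalize wrapper exists
vecnorm = unwrap_vec_normalize(eval_env) if isinstance(eval_env, VecEnv) else None
if vecnorm is not None:
    vecnorm.save(Path(repo_local_path) / "vec_normalize.pkl")
```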