Closed wenjunli-0 closed 3 years ago
Can you share the final exception/error you encounter? The message you shared is just a warning that training is not possible, but evaluation should still work. Setting the environment should definitely help.
Below are the output messages. The program gets stuck at step-2, i.e. when i=2. I waited more than 10 minutes for the results of step-2, whereas evaluating model_0.zip and model_1.zip takes only a few seconds. As i increases, the evaluation time seems to grow longer and longer. I assume there is a bug somewhere; otherwise a normal evaluation wouldn't be this slow.
Loading a model without an environment, this model cannot be trained until it has a valid environment.
Loading a model without an environment, this model cannot be trained until it has a valid environment.
Loading a model without an environment, this model cannot be trained until it has a valid environment.
Average testing performance at curriculum step-0: Re=-47.68431662522045
Loading a model without an environment, this model cannot be trained until it has a valid environment.
Average testing performance at curriculum step-1: Re=-79.47954591762584
Loading a model without an environment, this model cannot be trained until it has a valid environment.
Average testing performance at curriculum step-2: Re=-480.195298409764
Loading a model without an environment, this model cannot be trained until it has a valid environment.
I would double-check that the agent does indeed run (e.g. by creating a manual env-step loop and seeing how it behaves). 10 minutes for LunarLander indeed sounds like way too much, but it could be that your agents are learning to play very long episodes, and thus evaluation takes time.
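A manual env-step loop could look like the sketch below. To keep it self-contained, a stub env and a dummy policy stand in for the real `gym.make("LunarLanderContinuous-v2")` and `SAC.load(...)` calls; the point is the pattern: step until `done`, and treat an episode that never terminates as the symptom.

```python
class StubEnv:
    """Stub mimicking the classic gym API: reset() -> obs, step(a) -> (obs, r, done, info).
    Stands in for the real LunarLander env so this sketch runs anywhere."""
    def __init__(self, episode_len=200):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        return 0.0, -1.0, done, {}

def run_one_episode(env, policy, max_steps=5000):
    """Step the policy until done or max_steps; None means the episode
    never terminated, which would point at the hovering/slow-eval problem."""
    obs = env.reset()
    for t in range(max_steps):
        obs, reward, done, info = env.step(policy(obs))
        if done:
            return t + 1  # episode length in steps
    return None

print(run_one_episode(StubEnv(), policy=lambda obs: 0.0))  # 200
```

With the real setup you would replace `StubEnv()` with the actual env and `policy` with `lambda obs: model.predict(obs, deterministic=True)[0]`; if `run_one_episode` returns None, the agent is indeed playing unbounded episodes.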
Thanks for your explanation; I may have found the problem. The LunarLander environment does not have a max timestep limit per episode, so during evaluation the agent might just keep flying in the sky forever. This issue does not happen during training, only in evaluation. After I added a max timestep limit to the LunarLander env, the issue was fixed.
I guess there is a max timestep limit inside model.learn(), but no such limit in evaluate_policy().
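The fix described above amounts to wrapping the env in a time limit. A hand-rolled equivalent of gym's `TimeLimit` wrapper is sketched below (with a never-terminating stub env so the snippet runs without gym installed); the `TimeLimit.truncated` flag mirrors the convention the real wrapper uses.

```python
class HoverForever:
    """Stub env whose episodes never end on their own (like an agent hovering)."""
    def reset(self):
        return 0.0

    def step(self, action):
        return 0.0, 0.0, False, {}

class TimeLimit:
    """Forces done=True after max_episode_steps, even if the env never terminates."""
    def __init__(self, env, max_episode_steps=1000):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self._elapsed = 0

    def reset(self):
        self._elapsed = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed += 1
        if self._elapsed >= self.max_episode_steps:
            done = True
            info["TimeLimit.truncated"] = True
        return obs, reward, done, info

env = TimeLimit(HoverForever(), max_episode_steps=3)
env.reset()
done, steps = False, 0
while not done:
    _, _, done, info = env.step(0.0)
    steps += 1
print(steps)  # 3
```

In practice you would use gym's built-in wrapper instead, e.g. `gym.wrappers.TimeLimit(env, max_episode_steps=1000)`, and pass the wrapped env to `evaluate_policy` so evaluation episodes are bounded the same way training episodes are.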
I have trained a SAC model and stored checkpoints at different timesteps. I want to evaluate each of them and see how they perform in the testing environment. I can load and evaluate model_0.zip and model_1.zip properly, but loading model_2.zip fails with this error: "Loading a model without an environment, this model cannot be trained until it has a valid environment."
I tried several ways to fix this issue, but all of them failed.
Could you please help me take a look and see how I should fix it? I've also tried DDPG, and it has the same bug.