Farama-Foundation / MicroRTS-Py

A simple and highly efficient RTS-game-inspired environment for reinforcement learning (formerly Gym-MicroRTS)
MIT License

Error Running Evaluation after training. #116

Closed by rahuldwivedi1112 1 year ago

rahuldwivedi1112 commented 1 year ago

I get the error below after running the evaluation command:

    python ppo_gridnet_eval.py \
        --agent-model-path models/MicroRTSGridModeVecEnv__ppo_gridnet21689622827/agent.pt \
        --ai coacAI

    Traceback (most recent call last):
      File "/Users//MicroRTS-Py/experiments/ppo_gridnet_eval.py", line 150, in <module>
        agent.load_state_dict(torch.load(args.agent_model_path, map_location=device))
      File "/Users/Library/Caches/pypoetry/virtualenvs/gym-microrts-4rVzI_IH-py3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
        raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    RuntimeError: Error(s) in loading state_dict for Agent:
        Missing key(s) in state_dict: "encoder.7.weight", "encoder.7.bias", "encoder.10.weight", "encoder.10.bias", "actor.4.weight", "actor.4.bias", "actor.6.weight", "actor.6.bias".
        size mismatch for actor.0.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
        size mismatch for actor.0.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for actor.2.weight: copying a param with shape torch.Size([32, 78, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
        size mismatch for actor.2.bias: copying a param with shape torch.Size([78]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for critic.1.weight: copying a param with shape torch.Size([128, 1024]) from checkpoint, the shape in current model is torch.Size([128, 256]).

DennisSoemers commented 1 year ago

I am not the original developer of this code, but at a quick glance this looks like a simple model size mismatch to me. Note that when you run ppo_gridnet_eval.py, the default value of the --model-type parameter is "ppo_gridnet_large". A model of that type is trained by running ppo_gridnet_large.py, not ppo_gridnet.py.

If you do want to evaluate a model of the smaller type, you can pass --model-type "ppo_gridnet" to ppo_gridnet_eval.py on the command line.
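
To check which architecture a saved checkpoint matches before evaluating it, you can compare the parameter names and shapes in the checkpoint with those of the model you are about to build. The following is only a minimal sketch in plain Python, not code from the repository: the helper name diff_state_dict_shapes is hypothetical, the mismatched shapes are copied from the traceback above, and the shape given for "encoder.7.weight" is a placeholder because the thread does not state it.

```python
def diff_state_dict_shapes(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (missing, unexpected, mismatched), mirroring the three kinds of
    problems that load_state_dict reports in its error message.
    """
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    # Keys the checkpoint provides but the model does not have.
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    # Keys present on both sides whose shapes disagree.
    mismatched = {
        k: (checkpoint_shapes[k], model_shapes[k])
        for k in checkpoint_shapes
        if k in model_shapes and checkpoint_shapes[k] != model_shapes[k]
    }
    return missing, unexpected, mismatched


# With PyTorch installed, the two mappings could be collected along these lines:
#   {k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}
#   {k: tuple(v.shape) for k, v in agent.state_dict().items()}
# Here they are written out by hand so the sketch runs without PyTorch.
checkpoint_shapes = {
    "actor.0.bias": (32,),            # from the traceback (checkpoint side)
    "critic.1.weight": (128, 1024),   # from the traceback (checkpoint side)
}
model_shapes = {
    "actor.0.bias": (128,),           # from the traceback (current model)
    "critic.1.weight": (128, 256),    # from the traceback (current model)
    "encoder.7.weight": (0,),         # placeholder shape, not given in the thread
}

missing, unexpected, mismatched = diff_state_dict_shapes(checkpoint_shapes, model_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(f"size mismatch for {name}: checkpoint {ckpt_shape}, model {model_shape}")
```

If the lists come back non-empty, as they do here, the checkpoint was most likely saved from a different architecture than the one being constructed, which is the situation described above when --model-type is left at its default.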

rahuldwivedi1112 commented 1 year ago

Thank you