facebookresearch / habitat-challenge

Code for the habitat challenge
https://aihabitat.org
MIT License

PPO model evaluation with docker #21

Closed: ghost closed this issue 4 years ago

ghost commented 4 years ago

Hi!

I trained a PPO agent (from the habitat_baselines folder) on the PointNav task with the Gibson dataset. I wanted to evaluate the trained model locally using Docker, so I created the following PPO agent submission script:

import argparse

import habitat
from habitat.config import Config
from habitat_baselines.agents.ppo_agents import PPOAgent

def get_default_config():
    c = Config()
    c.INPUT_TYPE = "rgbd" 
    c.MODEL_PATH = "models/ckpt.199.pth"  # my trained model 
    c.RESOLUTION = 256
    c.HIDDEN_SIZE = 512
    c.RANDOM_SEED = 7
    c.PTH_GPU_ID = 1
    c.GOAL_SENSOR_UUID = "pointgoal"
    return c

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--evaluation", type=str, required=True, choices=["local", "remote"])
    args = parser.parse_args()

    agent_config = get_default_config()
    agent = PPOAgent(agent_config)

    if args.evaluation == "local":
        challenge = habitat.Challenge(eval_remote=False)
    else:
        challenge = habitat.Challenge(eval_remote=True)

    challenge.submit(agent)

if __name__ == "__main__":
    main()

I built the Docker image, and after running "sudo ./test_locally_pointnav_rgbd.sh --docker-name ppo_submission" I got the following error:

2020-03-11 18:50:08,104 Initializing dataset PointNav-v1
2020-03-11 18:50:08,108 initializing sim Sim-v0
2020-03-11 18:50:09,020 Initializing task Nav-v0
Traceback (most recent call last):
  File "ppo_agent.py", line 41, in <module>
    main()
  File "ppo_agent.py", line 37, in main
    challenge.submit(agent)
  File "/habitat-api/habitat/core/challenge.py", line 19, in submit
    metrics = super().evaluate(agent)
  File "/habitat-api/habitat/core/benchmark.py", line 159, in evaluate
    return self.local_evaluate(agent, num_episodes)
  File "/habitat-api/habitat/core/benchmark.py", line 133, in local_evaluate
    action = agent.act(observations)
  File "/habitat-api/habitat_baselines/agents/ppo_agents.py", line 134, in act
    deterministic=False,
  File "/habitat-api/habitat_baselines/rl/ppo/policy.py", line 40, in act
    observations, rnn_hidden_states, prev_actions, masks
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/habitat-api/habitat_baselines/rl/ppo/policy.py", line 167, in forward
    perception_embed = self.visual_encoder(observations)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/habitat-api/habitat_baselines/rl/models/simple_cnn.py", line 147, in forward
    return self.cnn(cnn_input)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/functional.py", line 1370, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [1 x 99712], m2: [25088 x 512] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:290

I am wondering what could cause such an error. Please help!

erikwijmans commented 4 years ago

Looks like the resolution is different than expected. We ran last year's challenge with 256x256, but now it's 640x480 (99712 / 25088 == (640x480)/(256x256)).
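
For reference, a minimal sketch of why those two numbers differ, assuming the 8/4, 4/2, 3/1 kernel/stride schedule used by the baselines' SimpleCNN (the exact schedule may vary across versions):

# Sketch: flattened output size of the SimpleCNN conv stack (assumed
# kernels/strides: 8/4, 4/2, 3/1; 32 output channels; no padding).
def conv_out(size, kernel, stride):
    return (size - kernel) // stride + 1

def flattened_size(height, width, channels=32):
    for kernel, stride in [(8, 4), (4, 2), (3, 1)]:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
    return height * width * channels

print(flattened_size(256, 256))  # 25088 -- the m2 dimension in the traceback
print(flattened_size(480, 640))  # a larger value for 640x480 input, hence the mismatch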

mathfac commented 4 years ago

As @erikwijmans mentioned, make sure you used the latest challenge config: challenge_pointnav2020.local.rgbd.yaml.

ghost commented 4 years ago

@mathfac @erikwijmans thanks for the answers! As I understand it, I trained the PPO model with last year's challenge configuration. So now the question is: how do I train a PPO model (from habitat-api/habitat_baselines) on the Gibson dataset for the PointNav task with the 2020 challenge configuration?

Should I use the configuration from challenge_pointnav2020.local.rgbd.yaml instead of pointnav_rgbd.yaml?

mathfac commented 4 years ago

@AdventureO, correct. Make sure you're using the latest version.
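
For completeness, a minimal sketch of loading the 2020 challenge task config as the base task config for baseline training; the paths below are assumptions and may differ across habitat-api versions:

# Hedged sketch: point habitat_baselines at the 2020 challenge task config
# instead of the old pointnav_rgbd.yaml. Paths are assumptions; adjust to
# your checkout.
from habitat_baselines.config.default import get_config

config = get_config(
    "habitat_baselines/config/pointnav/ppo_pointnav.yaml",
    ["BASE_TASK_CONFIG_PATH",
     "configs/tasks/challenge_pointnav2020.local.rgbd.yaml"],
)
# Under the 2020 config the sensors should report 640x480 rather than 256x256.
print(config.TASK_CONFIG.SIMULATOR.RGB_SENSOR.WIDTH,
      config.TASK_CONFIG.SIMULATOR.RGB_SENSOR.HEIGHT)

Training then proceeds the usual way (e.g. via habitat_baselines/run.py with --run-type train), with the agent-side config updated to match the new sensor size.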