facebookresearch / OccupancyAnticipation

This repository contains code for our publication "Occupancy Anticipation for Efficient Exploration and Navigation" in ECCV 2020.

Not able to load trained models #30

Closed Singh-sid930 closed 3 years ago

Singh-sid930 commented 3 years ago

Hi, I am trying to load the OccAnt(depth) models, but with no luck. I get the following error:

Unexpected key(s) in state_dict: "mapper_state_dict", "local_state_dict", "global_state_dict", "config", "extra_state".

I am not running evaluations through your codebase; rather, I am trying to replicate the test_occant.py file.

TASK_CONFIG = "OccupancyAnticipation/occant_baselines/config/ppo_exploration.yaml"
        cfg = get_config(TASK_CONFIG)
        occ_cfg = cfg.RL.ANS.OCCUPANCY_ANTICIPATOR
        OccNet = OccAntDepth(occ_cfg)
        OccNet.load_state_dict(torch.load("OccupancyAnticipation/ckpt.11.pth",map_location='cuda:0'))
        OccNet.eval()
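
For context, inspecting the checkpoint shows it is a full training checkpoint rather than a bare model state_dict (a minimal sketch, using the same path as above):

import torch

ckpt = torch.load("OccupancyAnticipation/ckpt.11.pth", map_location="cpu")
print(ckpt.keys())
# Per the error above: dict_keys(['mapper_state_dict', 'local_state_dict',
#                                 'global_state_dict', 'config', 'extra_state'])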

I tried using the specific state dicts like mapper_state_dict, but that also does not match the model's state dict.

Singh-sid930 commented 3 years ago

Can you point me to the correct state dict I should be using, or tell me if I am doing something wrong in creating or loading the model?

srama2512 commented 3 years ago

This is likely because of a mismatch between the model checkpoint and the config. Did you try using this config?

Singh-sid930 commented 3 years ago

So I tried that and it gave me the same error. However, I noticed something when I use OccNet.load_state_dict(state_dict["mapper_state_dict"]). Example of missing keys:

"gp_depth_proj_encoder.inc.conv.conv.0.weight", "gp_depth_proj_encoder.inc.conv.conv.0.bias"

Example of unexpected keys :

 "mapper.projection_unit.main.main.gp_depth_proj_encoder.inc.conv.conv.0.weight",

It seems like mapper.projection_unit.main.main. is just an additional prefix attached to every key. Or I could be wrong.

This is my code, run from OccupancyAnticipation/testOccantDepth.py, which is pretty basic if you want to reproduce the error on your end:

import torch
import habitat

from occant_baselines.models.occant import (
    ANSRGB,
    ANSDepth,
    OccAntRGB,
    OccAntDepth,
    OccAntRGBD,
    OccAntGroundTruth,
    OccupancyAnticipator,
)
from occant_baselines.config.default import get_config

TASK_CONFIG = "configs/model_configs/occant_depth/ppo_exploration.yaml"

bs = 16   # batch size
V = 128   # input/map resolution (V x V)

cfg = get_config(TASK_CONFIG)
occ_cfg = cfg.RL.ANS.OCCUPANCY_ANTICIPATOR

OccNet = OccAntDepth(occ_cfg)
state_dict = torch.load("ckpt.6.pth", map_location='cuda:0')
# print(state_dict["mapper_state_dict"])
OccNet.load_state_dict(state_dict["mapper_state_dict"])
OccNet.eval()

batch = {
    "rgb": torch.rand(bs, 3, V, V),
    "depth": torch.rand(bs, 1, V, V),
    "ego_map_gt": torch.rand(bs, 2, V, V),
    "ego_map_gt_anticipated": torch.rand(bs, 2, V, V),
}

y = OccNet(batch)
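
A quick check of the prefix hypothesis (my own sketch, reusing OccNet and state_dict from the code above):

prefix = "mapper.projection_unit.main.main."
model_keys = set(OccNet.state_dict().keys())
ckpt_keys = {
    k[len(prefix):]
    for k in state_dict["mapper_state_dict"].keys()
    if k.startswith(prefix)
}
print(model_keys == ckpt_keys)  # True would confirm the extra-prefix theory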
srama2512 commented 3 years ago

@Singh-sid930 - sorry for the late response. The above snippet will not work. You should be following the conventions used in the trainers for defining and loading the model. Here is the relevant snippet showing why those additional prefixes are attached to the state_dict keys.

occupancy_model = OccupancyAnticipator(occ_cfg)
occupancy_model = OccupancyAnticipationWrapper(
    occupancy_model, mapper_cfg.map_size, (128, 128)
)
# Create ANS model
self.ans_net = ActiveNeuralSLAMExplorer(ans_cfg, occupancy_model)
self.mapper = self.ans_net.mapper
# Load state dict
self.mapper_agent.load_state_dict(ckpt_dict["mapper_state_dict"])

This is the definition of OccupancyAnticipator.
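
To see where the prefix comes from: each attribute in the chain above contributes its name to the state_dict keys. Here is a toy, self-contained demonstration (illustrative class names only, not repo code), assuming the wrapper and the anticipator both store their child module under the attribute main, which matches the observed prefix:

import torch.nn as nn

class Inner(nn.Module):  # plays the role of OccAntDepth
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 1)

class Anticipator(nn.Module):  # plays the role of OccupancyAnticipator
    def __init__(self):
        super().__init__()
        self.main = Inner()

class Wrapper(nn.Module):  # plays the role of OccupancyAnticipationWrapper
    def __init__(self):
        super().__init__()
        self.main = Anticipator()

class Mapper(nn.Module):  # plays the role of the mapper holding the projection unit
    def __init__(self):
        super().__init__()
        self.projection_unit = Wrapper()

print(list(Inner().state_dict().keys()))
# ['conv.weight', 'conv.bias']
print(list(Mapper().state_dict().keys()))
# ['projection_unit.main.main.conv.weight', 'projection_unit.main.main.conv.bias']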

Singh-sid930 commented 3 years ago

In the meantime, I got this working with a small hack. I tried the method above, but it didn't work correctly. There is a high chance I am doing something wrong, so I will try it again later. But just to confirm: the OccAntDepth model class corresponds to this model: https://dl.fbaipublicfiles.com/OccupancyAnticipation/pretrained_models/occant_depth_ch/ckpt.13.pth. And inside this checkpoint, the "mapper_state_dict" key is what contains the mapper weights (plus a few more). This is my hack:

from collections import OrderedDict

occ_cfg = cfg.RL.ANS.OCCUPANCY_ANTICIPATOR
OccNet = OccAntDepth(occ_cfg).cuda()
state_dict = torch.load("OccupancyAnticipation/OccDepth.pth", map_location="cuda:0")
state_dict = state_dict["mapper_state_dict"]

new_state_dict = OrderedDict()

#### hack: strip the "mapper.projection_unit.main.main." prefix from every key
for k, v in state_dict.items():
    if "mapper.projection_unit.main.main." in k:
        name = k.replace("mapper.projection_unit.main.main.", "")
        new_state_dict[name] = v
OccNet.load_state_dict(new_state_dict)

OccNet.eval()
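
A slightly safer variant of the same hack (my own sketch): strip the prefix with startswith and let load_state_dict report anything that did not line up:

prefix = "mapper.projection_unit.main.main."
new_state_dict = OrderedDict(
    (k[len(prefix):], v) for k, v in state_dict.items() if k.startswith(prefix)
)
result = OccNet.load_state_dict(new_state_dict, strict=False)
print(result.missing_keys, result.unexpected_keys)  # ideally both empty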

I just had a few questions about the input and output.

q1: I notice the input is a 128x128 tensor with two channels. As I understand it: channel 0 = occupancy probability, channel 1 = exploration probability (confidence that the cell has been explored).

What is the purpose of the exploration channel? We do not have an exploration probability in our setup. Is it necessary for estimating occupancy? What we have is a top-down map of occupancy as seen by the robot, which we can also accumulate over steps. This map was created from the Habitat depth data.

Or can we assign high confidence values to the pixels the robot has explored (be they free or occupied)?

q2: For the 128x128 map, is the robot egocentric to the center of the map or to the bottom-middle of the map?

q3: Is the output also semantically the same as the input in terms of what the channels represent?

q6: Do the map inputs need to be maps accumulated over steps, or only the ground-truth map of the current step?

srama2512 commented 3 years ago

Yes. That checkpoint corresponds to OccAnt(depth). The method for loading the checkpoint looks good to me.

q1 - For non-anticipation models, channel 1 indicates whether that part of the space was observed or not (unobserved is 0, observed is 1). For anticipation models, channel 1 indicates the model's confidence about its prediction for that part of the map. Note that an anticipation model does not necessarily have to observe a location to predict its occupancy.
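
For illustration, a two-channel input for the non-anticipation case could be assembled like this (my own sketch with made-up values; channel 0 = occupancy, channel 1 = observed mask):

import torch

V = 128
occupancy = torch.zeros(V, V)   # channel 0: 1 where obstacles were detected
observed = torch.zeros(V, V)    # channel 1: 1 where the sensor actually covered
occupancy[60:68, 60:68] = 1.0   # a small obstacle
observed[32:96, 32:96] = 1.0    # region covered by the depth projection
ego_map_gt = torch.stack([occupancy, observed], dim=0).unsqueeze(0)  # (1, 2, V, V)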

q2 - The robot starts at the center of the map facing upward. But the full map is not rotated as the agent moves.

q3, q6 - What input and output are you referring to here?

Singh-sid930 commented 3 years ago

By input I mean the dict that is sent to the model, which in the case of OccAntDepth is

input = {'ego_map_gt': (bs, 2, H, W)}

And the output is then

y = model(input)
output = y["occ_estimate"] 

q3: For the output as well, is channel 0 = occupancy probability and channel 1 = prediction confidence?

q6: In the ego_map_gt input, is the map (the top-down occupancy) a snapshot of the current step, or the accumulation of all past steps in the episode?

q1: A quick clarification about the input then: in the ego_map_gt I send to the OccAntDepth model, channel 1 does not necessarily need to contain values representing observation? I am just trying to understand what I should put in channel 1 of ego_map_gt. We do track explored space, but not confidence of exploration.

srama2512 commented 3 years ago

ego_map_gt is basically the geometric projection of the depth sensor readings (non-anticipation). So the q1 response applies.

Here is a reference for how ego_map_gt is computed.
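
For intuition, a heavily simplified version of such a projection might look like the following (my own illustrative sketch, not the repo's implementation; it assumes a pinhole camera with known focal length fx, ignores height thresholds and free-space carving, and places the camera at the bottom-center of the local map):

import torch

def depth_to_ego_map(depth, fx, map_size=128, map_scale=0.05):
    # depth: (H, W) tensor of depth readings in meters
    H, W = depth.shape
    xs = (torch.arange(W, dtype=torch.float32) - W / 2) / fx  # normalized image x
    X = depth * xs.unsqueeze(0)  # lateral offset in meters, (H, W)
    Z = depth                    # forward distance in meters, (H, W)
    # Bin each depth point into a top-down grid cell.
    gx = (X / map_scale + map_size / 2).long().clamp(0, map_size - 1)
    gz = (map_size - 1 - (Z / map_scale).long()).clamp(0, map_size - 1)
    ego_map = torch.zeros(2, map_size, map_size)
    ego_map[0, gz, gx] = 1.0  # channel 0: cell received a depth return
    ego_map[1, gz, gx] = 1.0  # channel 1: cell was observed
    return ego_map

This glosses over the details that the repo's implementation handles, but it shows the basic geometry of turning depth readings into a two-channel top-down map.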

srama2512 commented 3 years ago

Closing this issue since the questions were addressed. Please feel free to re-open it if there are any more.