Ram81 / goat-bench


RuntimeError due to input size mismatch when running evaluation of goat_skill_chain #23

Open fylk317 opened 1 month ago

fylk317 commented 1 month ago

Input Size Mismatch in goat_skill_chain Evaluation

Problem Description

When running the evaluation of goat_skill_chain, two main issues are encountered:

  1. The checkpoint file ckpt.121.pth is missing.
  2. After commenting out the reference to the missing checkpoint, the following error occurs once the goal cache is loaded:
Cache dir: data/goat-assets/goal_cache/iin/val_seen_embeddings/wcojb4TFT35_embedding.pkl
  0%|          | 0/360 [00:00<?, ?it/s]
RuntimeError: input.size(-1) must be equal to input_size. Expected 1376, got 1632

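The error appears to stem from a mismatch between the RNN input sizes of the two policies logged at startup: an RNN built with input_size 1376 (Configuration 1) is receiving 1632 features (Configuration 2):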

Configuration 1:

Observation space info:
  compass: Box(-3.1415927, 3.1415927, (1,), float32)
  current_subtask: Box(0, 4, (1,), int32)
  gps: Box(-3.4028235e+38, 3.4028235e+38, (2,), float32)
  language_goal: Box(-inf, inf, (768,), float32)
  rgb: Box(0, 255, (224, 224, 3), uint8)

Near objectgoal policy: objectgoal
Near CLIP obj/img policy: clip_objectgoal -  clip_imagegoal
In CLIP policy: language_goal - True
Language embedding: 768, Add Language linear: False

RNN input size info: 
  prev_action: 32
  language_goal: 768
  gps_embedding: 32
  compass_embedding: 32
  visual_feats: 512
  --------------------------
  Total RNN input size: 1376

Configuration 2:

Observation space info:
  cache_instance_imagegoal: Box(-inf, inf, (1024,), float32)
  compass: Box(-3.1415927, 3.1415927, (1,), float32)
  current_subtask: Box(0, 4, (1,), int32)
  gps: Box(-3.4028235e+38, 3.4028235e+38, (2,), float32)
  rgb: Box(0, 255, (224, 224, 3), uint8)

Near objectgoal policy: objectgoal
Near CLIP obj/img policy: clip_objectgoal -  clip_imagegoal
In CLIP policy: language_goal - False
InstanceImage embedding: 1024, Add Instance linear: False

RNN input size info: 
  prev_action: 32
  instance_goal: 1024
  gps_embedding: 32
  compass_embedding: 32
  visual_feats: 512
  --------------------------
  Total RNN input size: 1632
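
As a quick sanity check on the two logs above, the sketch below sums the logged component widths for each configuration. The dictionary and function names are purely illustrative and are not part of goat-bench; only the numbers come from the logs.

```python
# Component widths copied from the two "RNN input size info" logs.
# All names here are illustrative, not goat-bench identifiers.
SHARED = {
    "prev_action": 32,
    "gps_embedding": 32,
    "compass_embedding": 32,
    "visual_feats": 512,
}

language_policy = {**SHARED, "language_goal": 768}    # Configuration 1
instance_policy = {**SHARED, "instance_goal": 1024}   # Configuration 2


def rnn_input_size(components):
    """Total width of the concatenated RNN input."""
    return sum(components.values())


language_total = rnn_input_size(language_policy)
instance_total = rnn_input_size(instance_policy)

print(language_total)                   # 1376
print(instance_total)                   # 1632
print(instance_total - language_total)  # 256
```

The totals differ only by the goal embedding width (1024 - 768 = 256), which is consistent with instance-image goal features reaching an RNN that was sized for the language-goal policy.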

Request for Assistance

I would greatly appreciate any guidance on how to resolve this input size mismatch and ensure compatibility between the two configurations. Thank you in advance for your help!

Ram81 commented 1 month ago

@fylk317 can you share the command you are using to run this eval? If you are using this command with the correct paths to the embedding files, it should work.

fylk317 commented 1 month ago

Hi @Ram81,

Here’s the command I’m using for eval:

DATA_PATH="data/datasets/goat_bench/hm3d/v1/"
tensorboard_dir="data/goat-assets/tensorboard/_logs/"
eval_ckpt_path_dir="data/goat-assets/checkpoints/sense_act_nn_skill_chain/"
split="val_seen"

python -um goat_bench.run \
  --run-type eval \
  --exp-config config/experiments/ver_goat_skill_chain.yaml \
  habitat_baselines.num_environments=1 \
  habitat_baselines.trainer_name="goat_ppo" \
  habitat_baselines.rl.policy.name=GoatHighLevelPolicy \
  habitat_baselines.tensorboard_dir=$tensorboard_dir \
  habitat_baselines.eval_ckpt_path_dir=$eval_ckpt_path_dir \
  habitat_baselines.checkpoint_folder=$eval_ckpt_path_dir \
  habitat.dataset.data_path="${DATA_PATH}/${split}/${split}.json.gz" \
  +habitat/task/lab_sensors@habitat.task.lab_sensors.clip_objectgoal_sensor=clip_objectgoal_sensor \
  +habitat/task/lab_sensors@habitat.task.lab_sensors.language_goal_sensor=language_goal_sensor \
  +habitat/task/lab_sensors@habitat.task.lab_sensors.cache_instance_imagegoal_sensor=cache_instance_imagegoal_sensor \
  ~habitat.task.lab_sensors.goat_goal_sensor \
  habitat.task.lab_sensors.cache_instance_imagegoal_sensor.cache=data/goat-assets/goal_cache/iin/${split}_embeddings/ \
  habitat.task.lab_sensors.language_goal_sensor.cache=data/goat-assets/goal_cache/language_nav/${split}_instruction_clip_embeddings.pkl \
  habitat_baselines.load_resume_state_config=False \
  habitat_baselines.eval.use_ckpt_config=False \
  habitat_baselines.eval.split=$split \
  habitat_baselines.eval.should_load_ckpt=False \
  habitat_baselines.should_load_agent_state=False

I’m using instruction_clip_embeddings.pkl instead of bert_embedding.pkl because bert_embedding.pkl is missing from goat-assets. Could this difference be causing the error? If so, could you let me know where the bert_embedding.pkl file is located?
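
One way to check whether a cache file matches the 768-wide language_goal input is to inspect the embedding widths it stores. The following is a minimal sketch that assumes nothing about the cache layout beyond it being a pickle of nested dicts/lists with array-like leaves; the helper names `embedding_dims` and `cache_dims` are hypothetical and not part of goat-bench.

```python
import pickle
from collections import Counter
from numbers import Number


def embedding_dims(obj, counts=None):
    """Count the last-axis size of every array-like leaf in a nested object."""
    if counts is None:
        counts = Counter()
    shape = getattr(obj, "shape", None)
    if shape is not None and len(shape) > 0:
        # Anything exposing .shape, e.g. a NumPy array or torch tensor.
        counts[int(shape[-1])] += 1
    elif isinstance(obj, dict):
        for value in obj.values():
            embedding_dims(value, counts)
    elif isinstance(obj, (list, tuple)):
        if obj and all(isinstance(x, Number) for x in obj):
            # A flat list of numbers is treated as one embedding vector.
            counts[len(obj)] += 1
        else:
            for value in obj:
                embedding_dims(value, counts)
    return counts


def cache_dims(path):
    """Load a pickle cache (trusted files only) and summarise its widths."""
    with open(path, "rb") as f:
        return embedding_dims(pickle.load(f))


# Example call, using the language cache path from the command above:
# cache_dims("data/goat-assets/goal_cache/language_nav/"
#            "val_seen_instruction_clip_embeddings.pkl")
```

If the language cache reports 768 everywhere, the CLIP file is probably not the source of the 1376 vs. 1632 mismatch. Note that unpickling executes code from the file, so this should only be run on files from a trusted source.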

Thanks!