allenai / allenact

An open source framework for research in Embodied-AI from AI2.
https://www.allenact.org

Unable to access testing scenes in RoboTHOR #360

Closed xiaobaishu0097 closed 1 year ago

xiaobaishu0097 commented 1 year ago

Problem

I am unable to access the testing scenes in RoboTHOR. When I try to open a testing scene following the ObjectNav Baseline, the program crashes and I receive an error message.

Steps to reproduce

Steps to reproduce the behavior:

  1. Go to 'Baseline models ObjectNav (for RoboTHOR/iTHOR)'
  2. Run the following two commands:
     export SAVED_MODEL_PATH=pretrained_model_ckpts/robothor-objectnav-challenge-2021/Objectnav-RoboTHOR-RGBD-ResNetGRU-DDPPO/2021-02-09_22-35-15/exp_Objectnav-RoboTHOR-RGBD-ResNetGRU-DDPPO_0.2.0a_300M__stage_00__steps_000170207237.pt
     python main.py projects/objectnav_baselines/experiments/robothor/objectnav_robothor_rgbd_resnetgru_ddppo.py -c $SAVED_MODEL_PATH --eval
  3. See error message about unknown scenes

Expected behavior

Evaluation on the testing scenes in RoboTHOR.

Error Message

[01/25 18:47:26 INFO:] Starting 0-th SingleProcessVectorSampledTasks generator with args {'mp_ctx': <multiprocessing.context.ForkServerContext object at 0x7f825ef6ddf0>, 'scenes': ['FloorPlan_test-challenge1_2'], 'object_types': ('AlarmClock', 'Apple', 'BaseballBat', 'BasketBall', 'Bowl', 'GarbageCan', 'HousePlant', 'Laptop', 'Mug', 'SprayBottle', 'Television', 'Vase'), 'max_steps': 500, 'sensors': [<allenact_plugins.ithor_plugin.ithor_sensors.RGBSensorThor object at 0x7f829000ea90>, <allenact_plugins.robothor_plugin.robothor_sensors.DepthSensorThor object at 0x7f825ef6dc40>, <allenact_plugins.ithor_plugin.ithor_sensors.GoalObjectTypeThorSensor object at 0x7f825ef6dfd0>], 'action_space': Discrete(6), 'seed': 1282648386, 'deterministic_cudnn': False, 'rewards_config': {'step_penalty': -0.01, 'goal_success_reward': 10.0, 'failed_stop_reward': 0.0, 'shaping_weight': 0}, 'env_args': {'width': 400, 'height': 300, 'commit_id': 'f0825767cd50d69f666c7f282e54abfe58f1e917', 'stochastic': True, 'continuousMode': True, 'applyActionNoise': True, 'rotateStepDegrees': 30.0, 'visibilityDistance': 1.0, 'gridSize': 0.25, 'snapToGrid': False, 'agentMode': 'locobot', 'fieldOfView': 63.453048374758716, 'include_private_scenes': False, 'renderDepthImage': True, 'gpu_device': 1, 'platform': <class 'ai2thor.platform.CloudRendering'>, 'all_metadata_available': False}, 'scene_directory': '/home/*/Code/allenact_heming/datasets/robothor-objectnav/test', 'loop_dataset': False}.  [vector_sampled_tasks.py: 1102]
Process ForkServerProcess-1:2:
Traceback (most recent call last):
  File "/home/*/miniconda3/envs/allenact/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/*/miniconda3/envs/allenact/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/*/Code/allenact_heming/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 346, in _task_sampling_loop_worker
    sp_vector_sampled_tasks = SingleProcessVectorSampledTasks(
  File "/home/*/Code/allenact_heming/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 897, in __init__
    self._vector_task_generators: List[Generator] = self._create_generators(
  File "/home/*/Code/allenact_heming/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 1116, in _create_generators
    if next(generators[-1]) != "started":
  File "/home/*/Code/allenact_heming/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 962, in _task_sampling_loop_generator_fn
    current_task = task_sampler.next_task()
  File "/home/*/Code/allenact_heming/allenact_plugins/robothor_plugin/robothor_task_samplers.py", line 460, in next_task
    self.env.reset(scene_name=scene)
  File "/home/*/Code/allenact_heming/allenact_plugins/robothor_plugin/robothor_environment.py", line 314, in reset
    self.controller.reset(scene_name)
  File "/home/*/miniconda3/envs/allenact/lib/python3.8/site-packages/ai2thor/controller.py", line 670, in reset
    raise ValueError(
ValueError: 
Scene 'FloorPlan_test-challenge4_1' not contained in build (scene names are case sensitive).
Please choose one of the following scene names:
ArchitecTHOR-Test-00, ArchitecTHOR-Test-01, ArchitecTHOR-Test-02, ArchitecTHOR-Test-03, ArchitecTHOR-Test-04, ArchitecTHOR-Val-00, ArchitecTHOR-Val-01, ArchitecTHOR-Val-02, ArchitecTHOR-Val-03, ArchitecTHOR-Val-04, FloorPlan10_physics, FloorPlan11_physics, FloorPlan12_physics, FloorPlan13_physics, FloorPlan14_physics, FloorPlan15_physics, FloorPlan16_physics, FloorPlan17_physics, FloorPlan18_physics, FloorPlan19_physics, FloorPlan1_physics, FloorPlan201_physics, FloorPlan202_physics, FloorPlan203_physics, FloorPlan204_physics, FloorPlan205_physics, FloorPlan206_physics, FloorPlan207_physics, FloorPlan208_physics, FloorPlan209_physics, FloorPlan20_physics, FloorPlan210_physics, FloorPlan211_physics, FloorPlan212_physics, FloorPlan213_physics, FloorPlan214_physics, FloorPlan215_physics, FloorPlan216_physics, FloorPlan217_physics, FloorPlan218_physics, FloorPlan219_physics, FloorPlan21_physics, FloorPlan220_physics, FloorPlan221_physics, FloorPlan222_physics, FloorPlan223_physics, FloorPlan224_physics, FloorPlan225_physics, FloorPlan226_physics, FloorPlan227_physics, FloorPlan228_physics, FloorPlan229_physics, FloorPlan22_physics, FloorPlan230_physics, FloorPlan23_physics, FloorPlan24_physics, FloorPlan25_physics, FloorPlan26_physics, FloorPlan27_physics, FloorPlan28_physics, FloorPlan29_physics, FloorPlan2_physics, FloorPlan301_physics, FloorPlan302_physics, FloorPlan303_physics, FloorPlan304_physics, FloorPlan305_physics, FloorPlan306_physics, FloorPlan307_physics, FloorPlan308_physics, FloorPlan309_physics, FloorPlan30_physics, FloorPlan310_physics, FloorPlan311_physics, FloorPlan312_physics, FloorPlan313_physics, FloorPlan314_physics, FloorPlan315_physics, FloorPlan316_physics, FloorPlan317_physics, FloorPlan318_physics, FloorPlan319_physics, FloorPlan320_physics, FloorPlan321_physics, FloorPlan322_physics, FloorPlan323_physics, FloorPlan324_physics, FloorPlan325_physics, FloorPlan326_physics, FloorPlan327_physics, FloorPlan328_physics, FloorPlan329_physics, 
FloorPlan330_physics, FloorPlan3_physics, FloorPlan401_physics, FloorPlan402_physics, FloorPlan403_physics, FloorPlan404_physics, FloorPlan405_physics, FloorPlan406_physics, FloorPlan407_physics, FloorPlan408_physics, FloorPlan409_physics, FloorPlan410_physics, FloorPlan411_physics, FloorPlan412_physics, FloorPlan413_physics, FloorPlan414_physics, FloorPlan415_physics, FloorPlan416_physics, FloorPlan417_physics, FloorPlan418_physics, FloorPlan419_physics, FloorPlan420_physics, FloorPlan421_physics, FloorPlan422_physics, FloorPlan423_physics, FloorPlan424_physics, FloorPlan425_physics, FloorPlan426_physics, FloorPlan427_physics, FloorPlan428_physics, FloorPlan429_physics, FloorPlan430_physics, FloorPlan4_physics, FloorPlan5_physics, FloorPlan6_physics, FloorPlan7_physics, FloorPlan8_physics, FloorPlan9_physics, FloorPlan_ExpRoom, FloorPlan_Train10_1, FloorPlan_Train10_2, FloorPlan_Train10_3, FloorPlan_Train10_4, FloorPlan_Train10_5, FloorPlan_Train11_1, FloorPlan_Train11_2, FloorPlan_Train11_3, FloorPlan_Train11_4, FloorPlan_Train11_5, FloorPlan_Train12_1, FloorPlan_Train12_2, FloorPlan_Train12_3, FloorPlan_Train12_4, FloorPlan_Train12_5, FloorPlan_Train1_1, FloorPlan_Train1_2, FloorPlan_Train1_3, FloorPlan_Train1_4, FloorPlan_Train1_5, FloorPlan_Train2_1, FloorPlan_Train2_2, FloorPlan_Train2_3, FloorPlan_Train2_4, FloorPlan_Train2_5, FloorPlan_Train3_1, FloorPlan_Train3_2, FloorPlan_Train3_3, FloorPlan_Train3_4, FloorPlan_Train3_5, FloorPlan_Train4_1, FloorPlan_Train4_2, FloorPlan_Train4_3, FloorPlan_Train4_4, FloorPlan_Train4_5, FloorPlan_Train5_1, FloorPlan_Train5_2, FloorPlan_Train5_3, FloorPlan_Train5_4, FloorPlan_Train5_5, FloorPlan_Train6_1, FloorPlan_Train6_2, FloorPlan_Train6_3, FloorPlan_Train6_4, FloorPlan_Train6_5, FloorPlan_Train7_1, FloorPlan_Train7_2, FloorPlan_Train7_3, FloorPlan_Train7_4, FloorPlan_Train7_5, FloorPlan_Train8_1, FloorPlan_Train8_2, FloorPlan_Train8_3, FloorPlan_Train8_4, FloorPlan_Train8_5, FloorPlan_Train9_1, FloorPlan_Train9_2, 
FloorPlan_Train9_3, FloorPlan_Train9_4, FloorPlan_Train9_5, FloorPlan_Train_Generated, FloorPlan_Val1_1, FloorPlan_Val1_2, FloorPlan_Val1_3, FloorPlan_Val1_4, FloorPlan_Val1_5, FloorPlan_Val2_1, FloorPlan_Val2_2, FloorPlan_Val2_3, FloorPlan_Val2_4, FloorPlan_Val2_5, FloorPlan_Val3_1, FloorPlan_Val3_2, FloorPlan_Val3_3, FloorPlan_Val3_4, FloorPlan_Val3_5, FloorPlan_test-dev2_2, Procedural, ProceduralNoLight, ProceduralSet, extreme_lighting_test, set_doors

Lucaweihs commented 1 year ago

Hi @xiaobaishu0097, this error looks like the task is attempting to use an incorrect (i.e. more recent) AI2-THOR build during testing. Can you confirm that you're using the latest version of AllenAct (commit == 24907f16cd6aace1abb2fef90c8e8667859c38b8) and that no changes have been made to the repository before running the above command? Note that the command actually shouldn't be able to run, as the file path

projects/objectnav_baselines/experiments/robothor/objectnav_robothor_rgbd_resnetgru_ddppo.py

has changed to

projects/objectnav_baselines/experiments/robothor/objectnav_robothor_rgbd_resnet18gru_ddppo.py

xiaobaishu0097 commented 1 year ago

Hello @Lucaweihs, thanks for your quick reply and the reminder. I had noticed the renamed experiment file and corrected the command to point to the right file.

As for the error, I have updated to the latest version of AllenAct, and the same error is raised again. The error might be caused by the AI2-THOR platform rather than AllenAct. I am trying to run an evaluation on the RoboTHOR testing scenes (e.g., FloorPlan_test-challenge1_1). However, according to the error message, those scenes do not seem to be accessible in any version of AI2-THOR.

I am just curious whether AI2-THOR no longer supports those scenes.

Lucaweihs commented 1 year ago

Hi @xiaobaishu0097,

The testing scenes are not included in all AI2-THOR builds. Thankfully, you can access old THOR builds very easily by specifying the commit_id parameter in the ai2thor.controller.Controller constructor. For instance, can you try running

from ai2thor.controller import Controller
c = Controller(commit_id="bad5bc2b250615cb766ffb45d455c211329af17e")
c.reset("FloorPlan_test-challenge1_1")
print(c.last_event.metadata["sceneName"]) # Prints the current scene name

And confirm that no error occurs and that the scene was successfully set to be "FloorPlan_test-challenge1_1"?

If the above works, I'm a bit confused as to why you're getting the error in AllenAct as we explicitly set the THOR build commit_id to be the above commit_id (see this line).

xiaobaishu0097 commented 1 year ago

Thanks for your quick reply. I have found the problem with this commit.

Since I am using headless servers, which cannot run startx.py successfully, I chose to run AllenAct in headless mode. However, when AllenAct runs in headless mode, the commit_id is changed to ai2thor.build.COMMIT_ID (see here), which is f0825767cd50d69f666c7f282e54abfe58f1e917. Meanwhile, may I ask why AI2-THOR did not include those testing scenes in the latest versions?
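To make this kind of silent override easier to spot, a small guard can compare the commit_id that ends up in env_args with the build that is expected to contain the test scenes. The sketch below is only an illustration: check_commit_id is a hypothetical helper, not part of AllenAct or AI2-THOR, and the expected value is simply the commit_id quoted earlier in this thread.

```python
# Hypothetical helper, not part of AllenAct or AI2-THOR.
# The expected build is the commit_id quoted earlier in this thread.
EXPECTED_COMMIT_ID = "bad5bc2b250615cb766ffb45d455c211329af17e"


def check_commit_id(env_args: dict, expected: str = EXPECTED_COMMIT_ID) -> None:
    """Raise ValueError if env_args would launch a different AI2-THOR build."""
    actual = env_args.get("commit_id")
    if actual != expected:
        raise ValueError(
            f"env_args['commit_id'] is {actual!r}, expected {expected!r}; "
            "the test scenes may be missing from this build."
        )


# The commit_id below is the one that appears in the log of the headless run.
headless_env_args = {"commit_id": "f0825767cd50d69f666c7f282e54abfe58f1e917"}
try:
    check_commit_id(headless_env_args)
except ValueError as err:
    print(err)
```

Calling such a check right before the environment is created would fail with a clear message instead of the long "not contained in build" listing.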

To try to solve the issue, I am trying to install AllenAct in a Docker container. Since ai2thor-docker cannot be installed correctly on my machine, I am using the NVIDIA PyTorch Docker image instead, i.e., nvcr.io/nvidia/pytorch:22.12-py3. I followed these steps:

docker run -it -d --name="allenact" --gpus 'all' --env DISPLAY:$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix:rw --pid=host --ipc=host --network=host --privileged=True -v /path/to/.ssh:/root/.ssh nvcr.io/nvidia/pytorch:22.12-py3
docker exec -it allenact bash

Then I tried to run the script with python ./scripts/startx.py in the allenact directory. Since I run the container as root, I did not use sudo when running the Python file. However, I got the following error message:

Traceback (most recent call last):
  File "./scripts/startx.py", line 101, in <module>
    startx()
  File "./scripts/startx.py", line 73, in startx
    for r in pci_records():
  File "./scripts/startx.py", line 17, in pci_records
    output = subprocess.check_output(command).decode()
  File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'lspci'

Following the suggestion in another issue, I installed pciutils with apt install pciutils and re-ran python ./scripts/startx.py. This time a new error message was raised inside the container:

Traceback (most recent call last):
  File "./scripts/startx.py", line 101, in <module>
    startx()
  File "./scripts/startx.py", line 89, in startx
    proc = subprocess.Popen(command)
  File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'Xorg'
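Both tracebacks above come from an executable that is missing inside the container. A minimal sketch of a pre-flight check, assuming lspci and Xorg (the two names taken from the tracebacks) are the only executables needed; missing_binaries is a hypothetical helper and not part of startx.py:

```python
import shutil

# Assumption: these are the two executables named in the tracebacks above;
# startx.py may need others that are not listed here.
REQUIRED_BINARIES = ("lspci", "Xorg")


def missing_binaries(names=REQUIRED_BINARIES):
    """Return the subset of names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All required executables were found on PATH.")
```

Running this before startx.py reports every missing tool at once rather than one FileNotFoundError per attempt.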

Following the suggestion in another issue, I then installed xserver-xorg with 'DEBIAN_FRONTEND=noninteractive apt install xserver-xorg'. Unfortunately, running 'python ./scripts/startx.py' afterwards produced yet another error:

_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
(EE)
Fatal server error:
(EE) Cannot establish any listening sockets - Make sure an X server isn't already running(EE)
(EE)
Please consult the The X.Org Foundation support
         at http://wiki.x.org
 for help.
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE)
(EE) Server terminated with error (1). Closing log file.
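One possible reading of this message is that the requested display is already claimed, for example because the container shares the host's /tmp/.X11-unix directory. That is an assumption and is not confirmed in this thread. A minimal sketch for checking it, based on the usual X11 convention of a /tmp/.X<n>-lock file and a /tmp/.X11-unix/X<n> socket per display (both function names are hypothetical):

```python
from pathlib import Path


def display_in_use(display: int, tmp_dir: str = "/tmp") -> bool:
    """Heuristic: a display counts as taken if its lock file or socket exists."""
    tmp = Path(tmp_dir)
    lock_file = tmp / f".X{display}-lock"
    socket_file = tmp / ".X11-unix" / f"X{display}"
    return lock_file.exists() or socket_file.exists()


def first_free_display(max_display: int = 16, tmp_dir: str = "/tmp") -> int:
    """Return the lowest display number below max_display that looks unused."""
    for display in range(max_display):
        if not display_in_use(display, tmp_dir):
            return display
    raise RuntimeError(f"No free display found below :{max_display}")


if __name__ == "__main__":
    print("Display :0 in use:", display_in_use(0))
    print("First free display: :%d" % first_free_display())
```

This only inspects files, so it can miss servers that listen in other ways; it is meant as a quick diagnostic, not a guarantee.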

Could you please help me on this matter? Thanks for your time!

xiaobaishu0097 commented 1 year ago

In the end, I managed to start the X server on the headless machine without Docker. I think that might be the best solution for now.

Lucaweihs commented 1 year ago

Hi @xiaobaishu0097 ,

Sorry for the delay! I'm happy to hear you managed to make things work by starting the X-display.

> Meanwhile, may I ask why AI2-THOR did not include those testing scenes in the latest versions?

Two reasons:

  1. We've taken some special precautions in that build to ensure that the agent metadata is not included when opening a test scene. This is necessary to lower the chance that people accidentally train on these scenes or use the ground truth metadata to solve the task.
  2. As AI2-THOR receives updates, there are small changes to scenes and agent behavior. For fairness when people report results to the leaderboard, it is easiest to require that everyone uses the same build.

xiaobaishu0097 commented 1 year ago

Thanks for your kind reply!