allenai / procthor

🏘️ Scaling Embodied AI by Procedurally Generating Interactive 3D Houses
https://procthor.allenai.org/
Apache License 2.0

Some ProcTHOR Files May Be Corrupted #2

Open brandontrabucco opened 1 year ago

brandontrabucco commented 1 year ago

Hello AI2 Team,

This issue concerns the rearrangement extension of ProcTHOR (it looks like issues can't be filed on that repository, perhaps because it's a template):

https://github.com/jordis-ai2/ai2thor-rearrangement-procthor/tree/procthor-2022

It seems that certain files in data/2022procthor are corrupted. Steps to reproduce:

git clone https://github.com/jordis-ai2/ai2thor-rearrangement-procthor.git -b procthor-2022
cd ai2thor-rearrangement-procthor/data/2022procthor/split_train
pip install compress_json

Then in a python interpreter:

>>> import compress_json
>>> compress_json.load("scene_names.json.gz")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/anaconda3/lib/python3.8/site-packages/compress_json/compress_json.py", line 195, in load
    json_content = json.load(file, **json_kwargs)
  File "/home/user/anaconda3/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/home/user/anaconda3/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/home/user/anaconda3/lib/python3.8/gzip.py", line 479, in read
    if not self._read_gzip_header():
  File "/home/user/anaconda3/lib/python3.8/gzip.py", line 427, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b've')

I get the same error when attempting to train your procthor model:

allenact -b baseline_configs/one_phase/procthor one_phase_rgb_clip_dagger \
 -s 12345 --config_kwargs '{"distributed_nodes":1}'

Thanks for helping resolve this!

Brandon

mattdeitke commented 1 year ago

@jordis-ai2 @Lucaweihs can you take a look? It appears to be part of how the rearrangement houses were distributed.

Should we move them to the prior package? That would probably solve these types of issue.

jordis-ai2 commented 1 year ago

Hi @brandontrabucco,

I can open the file on my side, but, in principle, you should be able to delete it, since it should be re-created here.

Let us know if that helps.

jordis-ai2 commented 1 year ago

> @jordis-ai2 @Lucaweihs can you take a look? It appears to be part of how the rearrangement houses were distributed.
>
> Should we move them to the prior package? That would probably solve these types of issue.

Oh, I see. So it seems to be a problem with the way I'm using git-lfs there. In that case the same issue might apply to all the other data files, and those should definitely not be regenerated, for the sake of repeatability.

I have an open question in the PR (under review by @Lucaweihs) regarding whether it would be best to move to a prior package or just follow the convention in the rest of the rearrangement repository. I'm happy with either option.
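
For readers hitting the same BadGzipFile error: the two bytes b've' reported in the traceback are consistent with a Git LFS pointer file being checked out in place of the real gzip archive. A pointer file is a small text file that begins with "version https://git-lfs.github.com/spec/v1". The following is a minimal sketch for telling the two apart by their leading bytes. classify_file is a hypothetical helper name, not part of compress_json or the rearrangement repository.

```python
# Magic bytes that open every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"
# First line of a Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def classify_file(path):
    """Return 'gzip', 'lfs-pointer', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(LFS_POINTER_PREFIX):
        return "lfs-pointer"
    return "unknown"
```

Calling classify_file("scene_names.json.gz") from the split_train directory and getting "lfs-pointer" would suggest that the file contents were never fetched through Git LFS, rather than that the archive itself is damaged.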

brandontrabucco commented 1 year ago

Thanks for responding to this quickly! I'm following your new instructions in the README for the rearrangement procthor dataset, but I'm encountering a new error after running inv install-procthor-dataset:

$ inv install-procthor-dataset
Traceback (most recent call last):
  File "/anaconda3/envs/rearrange/bin/inv", line 8, in <module>
    sys.exit(program.run())
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/invoke/program.py", line 384, in run
    self.execute()
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/invoke/program.py", line 566, in execute
    executor.execute(*self.tasks)
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/invoke/executor.py", line 129, in execute
    result = call.task(*args, **call.kwargs)
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/invoke/tasks.py", line 127, in __call__
    result = self.body(*args, **kwargs)
  File "/code/ai2thor-rearrangement-procthor/tasks.py", line 671, in install_procthor_dataset
    all_data = prior.load_dataset("procthor_rearrangement_2022")
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/prior/__init__.py", line 269, in load_dataset
    sha, token = _clone_repo(
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/prior/__init__.py", line 191, in _clone_repo
    raise Exception(
Exception: Could not find dataset.
If you're using a private repo, override the github auth token with:
    import prior
    prior.gh_auth_token = <token>
Alternatively, you can set the environment variable with:
    export GITHUB_TOKEN=<token>
from the command line.

Is this expected?

jordis-ai2 commented 1 year ago

Unfortunately, yes. I didn't announce the new commit in this thread because the dataset is temporarily internal (we are waiting for feedback from the team to complete the design). As a workaround, using git-lfs with the previous commit should get you going, as far as I understand, provided that access to the set of 10k small ProcTHOR houses is public (I have no control over that side of things). Note that the data in the procthor_rearrangement_2022 repository only contains the sets of episodes used for pre-training and validation in our published experiments, not the house specifications.

brandontrabucco commented 1 year ago

Following up on this issue,

When I attempt to use ProcTHOR with the rearrangement challenge via https://github.com/jordis-ai2/ai2thor-rearrangement-procthor/tree/procthor-2022 (files downloaded with git-lfs), I am encountering a new issue.

from baseline_configs.one_phase.procthor.one_phase_rgb_clip_dagger \
    import ProcThorOnePhaseRGBClipResNet50DaggerTrainMultiNodeConfig

config = ProcThorOnePhaseRGBClipResNet50DaggerTrainMultiNodeConfig()

task_sampler = config.make_sampler_fn(
    **config.stagewise_task_sampler_args(
        stage="train",
        process_ind=0,
        total_processes=1,
        devices=[0]),
    force_cache_reset=False,
    epochs=1)

task = task_sampler.next_task()

The traceback is:

Traceback (most recent call last):
  File "main.py", line 15, in <module>
    task = task_sampler.next_task()
  File "/code/ai2thor-rearrangement-procthor/rearrange/tasks.py", line 1203, in next_task
    raise e
  File "/code/ai2thor-rearrangement-procthor/rearrange/tasks.py", line 1132, in next_task
    self.unshuffle_env.reset(
  File "/code/ai2thor-rearrangement-procthor/rearrange/procthor_rearrange/environment.py", line 750, in reset
    self._task_spec_reset(
  File "/code/ai2thor-rearrangement-procthor/rearrange/procthor_rearrange/environment.py", line 641, in _task_spec_reset
    self.procthor_reset(
  File "/code/ai2thor-rearrangement-procthor/rearrange/procthor_rearrange/environment.py", line 569, in procthor_reset
    self.controller.step(
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/ai2thor/controller.py", line 961, in step
    raise RuntimeError(
RuntimeError: KeyNotFoundException: The given key was not present in the dictionary.. trace:   at System.Collections.Generic.Dictionary`2[TKey,TValue].get_Item (TKey key) [0x0001e] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at Thor.Procedural.AssetMap`1[T].getAsset (System.String name) [0x00000] in <6afd8f78be764eeba7be30f178fa1cb8>:0 
  at Thor.Procedural.ProceduralTools.CreateHouse (Thor.Procedural.Data.ProceduralHouse house, Thor.Procedural.AssetMap`1[T] materialDb, System.Nullable`1[T] position) [0x0035a] in <6afd8f78be764eeba7be30f178fa1cb8>:0 
  at UnityStandardAssets.Characters.FirstPerson.BaseFPSAgentController.CreateHouse (Thor.Procedural.Data.ProceduralHouse house) [0x000e1] in <6afd8f78be764eeba7be30f178fa1cb8>:0 
  at (wrapper managed-to-native) System.Reflection.MonoMethod.InternalInvoke(System.Reflection.MonoMethod,object,object[],System.Exception&)
  at System.Reflection.MonoMethod.Invoke (System.Object obj, System.Reflection.BindingFlags invokeAttr, System.Reflection.Binder binder, System.Object[] parameters, System.Globalization.CultureInfo culture) [0x00032] in <695d1cc93cca45069c528c15c9fdd749>:0

When I follow the debugging steps from a related thread (https://github.com/allenai/procthor-10k/issues/6) and set PROCTHOR_COMMIT_ID = "391b3fae4d4cc026f1522e5acf60953560235971", rather than its original value of PROCTHOR_COMMIT_ID = "90eac925dc750818890069e3131f899998dc58b4", the code gets further but then fails with a second error:

Traceback (most recent call last):
  File "main.py", line 15, in <module>
    task = task_sampler.next_task()
  File "/code/ai2thor-rearrangement-procthor/rearrange/tasks.py", line 1203, in next_task
    raise e
  File "/code/ai2thor-rearrangement-procthor/rearrange/tasks.py", line 1132, in next_task
    self.unshuffle_env.reset(
  File "/code/ai2thor-rearrangement-procthor/rearrange/procthor_rearrange/environment.py", line 750, in reset
    self._task_spec_reset(
  File "/code/ai2thor-rearrangement-procthor/rearrange/procthor_rearrange/environment.py", line 712, in _task_spec_reset
    self.controller.step(
  File "/anaconda3/envs/rearrange/lib/python3.8/site-packages/ai2thor/controller.py", line 959, in step
    raise ValueError(self.last_event.metadata["errorMessage"])
ValueError: 
        Action: "OpenObject" called with invalid arguments: 'actionSimulationSeconds', 'fixedDeltaTime'
        Expected arguments: String objectId, Boolean forceAction = False, Single openness = 1, Nullable`1 moveMagnitude = 
        Your arguments: 'objectId', 'openness', 'forceAction', 'actionSimulationSeconds', 'fixedDeltaTime'
        Valid ways to call "OpenObject" action:
                Void OpenObject(String objectId, Boolean forceAction = False, Single openness = 1, Nullable`1 moveMagnitude = )
                Void OpenObject(Single x, Single y, Boolean forceAction = False, Single openness = 1, Nullable`1 moveMagnitude = )

It seems like the version of AI2-THOR is not compatible with environment.py. Have you seen this kind of error before, and do you know how it might be solved?

Thanks, Brandon
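
The second error message above already lists which keyword arguments the simulator build rejected. As a diagnostic aid only (this is not the fix the maintainers applied), the sketch below compares the arguments that were sent with those the build accepts. unsupported_args is a hypothetical helper, and both argument lists are copied from the error message rather than queried from AI2-THOR.

```python
def unsupported_args(sent, accepted):
    """Return the argument names in `sent` that are not in `accepted`, in order."""
    accepted_set = set(accepted)
    return [name for name in sent if name not in accepted_set]


# Names copied from the "Expected arguments" line of the error message.
OPEN_OBJECT_ACCEPTED = ["objectId", "forceAction", "openness", "moveMagnitude"]
# Names copied from the "Your arguments" line of the error message.
OPEN_OBJECT_SENT = [
    "objectId",
    "openness",
    "forceAction",
    "actionSimulationSeconds",
    "fixedDeltaTime",
]

rejected = unsupported_args(OPEN_OBJECT_SENT, OPEN_OBJECT_ACCEPTED)
print(rejected)  # ['actionSimulationSeconds', 'fixedDeltaTime']
```

A non-empty result suggests that environment.py and the simulator build selected by PROCTHOR_COMMIT_ID expect different action signatures.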

jordis-ai2 commented 1 year ago

Hi @brandontrabucco,

I was able to replicate the error in my setup. I would suggest continuing to use PROCTHOR_COMMIT_ID = "90eac925dc750818890069e3131f899998dc58b4". The error comes from the Houses utility class, where the new interface did not ensure data consistency. I'm pushing a new commit that fixes the issue.

Thanks for the heads up!