askforalfred / alfred

ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

Is it possible to navigate directly in the AI2-THOR environment? #62

Closed StOnEGiggity closed 3 years ago

StOnEGiggity commented 3 years ago

Hi,

I have a question: is it possible to navigate directly in the AI2-THOR environment? I notice the training data includes discrete visual features. I would like to run the AI2-THOR environment directly and obtain the status of interactive objects from the API.

I am new to AI2-THOR and ALFRED, so if I have misunderstood anything about them, please point out my mistakes.

Thanks a lot.

MohitShridhar commented 3 years ago

@StOnEGiggity the seq2seq training uses a static dataset to train the models offline. If you are interested in using additional info from the simulator, you have two options:

  1. Augment Dataset: Run augment_trajectories.py to replay all the trajectories in the dataset, save additional info about objects etc., and then train models offline with the augmented dataset.
  2. Online Training: Instead of a static dataset, use something like eval_seq2seq.py for online interaction. Currently eval_seq2seq.py takes a pre-trained model and evaluates it interactively in the simulator (with navigation and manipulation); you will need to modify it for your needs. A minimal sketch of querying the simulator directly follows below.
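For direct interaction outside the dataset, the AI2-THOR controller itself exposes discrete navigation actions and per-object metadata. Below is a minimal sketch assuming the ai2thor 2.1.0 API that ALFRED pins; the scene name and metadata fields shown are illustrative and can differ across AI2-THOR releases. ALFRED's own wrapper in `env/thor_env.py` (`ThorEnv`) builds on this same `Controller` interface.

```python
# A minimal sketch, assuming the ai2thor 2.1.0 API used by ALFRED.
# Scene name and metadata fields below are illustrative; exact field
# names can differ across AI2-THOR releases.
import ai2thor.controller

controller = ai2thor.controller.Controller()
controller.start()
controller.reset('FloorPlan28')
event = controller.step(dict(action='Initialize', gridSize=0.25))

# Navigate directly with discrete actions.
event = controller.step(dict(action='MoveAhead'))
event = controller.step(dict(action='RotateRight'))

# Query the state of interactive objects from the returned metadata.
for obj in event.metadata['objects']:
    if obj['visible']:
        print(obj['objectId'], obj['objectType'], obj['distance'])

controller.stop()
```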
ankit61 commented 3 years ago

Are there plans to support the latest AI2-THOR version, 2.7.4? It seems to speed up rendering, which could make it practical to apply (sample-inefficient) RL algorithms directly in the environment. I am new to both AI2-THOR and ALFRED, but if you are aware of the changes needed to support the latest version, I could try to implement them myself.

MohitShridhar commented 3 years ago

Hi @ankit61, this pull request thread might be relevant: https://github.com/askforalfred/alfred/pull/45

Unfortunately, minor changes in the scene layouts between 2.1.0 and 2.7.4 would invalidate some referring expressions and instructions in the dataset. Nonetheless, you could pre-train an agent in 2.7.4 to learn robust visual representations and possibly fine-tune it with language data in 2.1.0.
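As a rough illustration of that pretraining route, here is a hedged sketch of collecting observations from a newer AI2-THOR for visual representation learning. The keyword-style `Controller(...)` and `step(action=...)` calls assume a 2.2+ release; the scene, grid size, and action set are placeholders, not a prescribed setup.

```python
# A rough sketch of collecting observations in a newer AI2-THOR
# (2.2+ keyword-style API) for visual pretraining; scene, grid size,
# and action set here are placeholders.
import random
from ai2thor.controller import Controller

controller = Controller(scene='FloorPlan1', gridSize=0.25)
actions = ['MoveAhead', 'RotateLeft', 'RotateRight', 'LookUp', 'LookDown']

frames = []
for _ in range(100):
    event = controller.step(action=random.choice(actions))
    frames.append(event.frame)  # RGB numpy array, usable as encoder input

controller.stop()
```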

MohitShridhar commented 3 years ago

Closing due to inactivity.