allenai / allenact

An open source framework for research in Embodied-AI from AI2.
https://www.allenact.org

Adding CLIP encoders #328

Closed: apoorvkh closed this pull request 2 years ago

apoorvkh commented 2 years ago

This adds visual and text encoders from CLIP for use in RoboTHOR ObjectNav. It can be invoked in a "zeroshot" mode, where object categories are split into seen/unseen sets and their names are encoded with CLIP's text encoder, or the visual encoder can simply be replaced with CLIP's ResNet.
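A minimal sketch of what the zeroshot setup described above might look like. The object list matches RoboTHOR ObjectNav's target categories, but the split rule and the prompt template are illustrative assumptions, not the PR's actual configuration:

```python
# Hypothetical sketch of a "zeroshot" seen/unseen object split.
# The split heuristic and prompt template below are assumptions for
# illustration; the PR's real configuration may differ.

ROBOTHOR_OBJECTS = [
    "AlarmClock", "Apple", "BaseballBat", "BasketBall", "Bowl", "GarbageCan",
    "HousePlant", "Laptop", "Mug", "SprayBottle", "Television", "Vase",
]

def split_seen_unseen(objects, unseen_every=3):
    """Deterministically hold out every Nth category as 'unseen'."""
    seen, unseen = [], []
    for i, name in enumerate(sorted(objects)):
        (unseen if i % unseen_every == 0 else seen).append(name)
    return seen, unseen

def to_prompts(objects):
    """Build natural-language prompts to feed a CLIP-style text encoder."""
    return [f"a photo of a {name}" for name in objects]

seen, unseen = split_seen_unseen(ROBOTHOR_OBJECTS)
prompts = to_prompts(unseen)
```

In the actual pipeline, the prompts for the unseen categories would be tokenized and passed through CLIP's text encoder so the agent can ground goal names it never saw during training.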

Prerequisites

```shell
pip install numpy-quaternion
pip install numpy==1.21
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
```

Training example

```shell
PYTHONPATH=. allenact -b projects/objectnav_baselines/experiments/robothor objectnav_robothor_zeroshot_rgb_clipgru_ddppo
```
lgtm-com[bot] commented 2 years ago

This pull request introduces 4 alerts when merging 1ad4dc883c97683adc99f8ee30c3f781d47f69ea into 9da8674e7781370b4c257eab707a613e953c002f - view on LGTM.com

new alerts:

apoorvkh commented 2 years ago

Accidentally created this PR with the wrong branch. See #329 instead.