Closed: apoorvkh closed this pull request 2 years ago
This pull request introduces 4 alerts when merging 1ad4dc883c97683adc99f8ee30c3f781d47f69ea into 9da8674e7781370b4c257eab707a613e953c002f - view on LGTM.com
Accidentally created this PR with the wrong branch. See #329 instead.
Adds visual and text encoders from CLIP for use in RoboTHOR ObjectNav. The model can be invoked in "zeroshot" mode, where objects are split into seen and unseen sets and their names are encoded with CLIP's text encoder. Alternatively, CLIP's ResNet can simply replace the existing visual encoder.
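To make the "zeroshot" idea concrete, here is a minimal sketch of splitting object names into seen/unseen sets and encoding them with CLIP's text encoder. It is not the code from this PR: the helper names `split_seen_unseen` and `encode_names_with_clip`, the seeded random split, the `RN50` default, and the L2 normalization are all assumptions made for illustration. The second helper assumes `torch` and the `clip` package from the openai/CLIP repository are installed.

```python
import random
from typing import List, Sequence, Tuple


def split_seen_unseen(
    names: Sequence[str], num_unseen: int, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: deterministically split names into (seen, unseen)."""
    if not 0 <= num_unseen <= len(names):
        raise ValueError("num_unseen must be between 0 and len(names)")
    # Sort first so the split depends only on the set of names and the seed.
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    unseen = sorted(shuffled[:num_unseen])
    seen = sorted(shuffled[num_unseen:])
    return seen, unseen


def encode_names_with_clip(
    names: Sequence[str], model_name: str = "RN50", device: str = "cpu"
):
    """Hypothetical helper: embed object names with CLIP's text encoder.

    Imports are local so the split helper above works without CLIP installed.
    """
    import torch
    import clip  # package from the openai/CLIP repository

    model, _preprocess = clip.load(model_name, device=device)
    tokens = clip.tokenize(list(names)).to(device)
    with torch.no_grad():
        features = model.encode_text(tokens)
    # Assumption: unit-normalize so embeddings can be compared by dot product.
    return features / features.norm(dim=-1, keepdim=True)
```

With a placeholder list such as `["Apple", "Mug", "Laptop", "Television"]`, calling `split_seen_unseen(names, num_unseen=1)` returns three seen names and one held-out name. Passing either list to `encode_names_with_clip` then yields one embedding row per name. The object list and split size here are illustrative and are not the ones used in this PR.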
Prerequisites
Training example