rginjapan closed this issue 1 year ago
So you mean you did the fine-tuning for your pipeline?
For the feed-forward method, yes. You can find more details in our paper.
I will, but do you plan to publish the fine-tuning code in the repo?
Thanks! Could you give more guidance on how to use (train) the fine-tuning pipeline?
BTW, where is "versioned_data"?
```
Object-Goal-Navigation/
└── data/
    ├── scene_datasets/
    ├── matterport_category_mappings.tsv
    ├── object_norm_inv_perplexity.npy
    ├── versioned_data
    └── objectgoal_hm3d/
        ├── train/
        ├── val/
        └── val_mini/
```
After reading your paper, I think the zero-shot you mention just means there is no need to train with the LLM. You still specify the object category at evaluation time, so it is not task-oriented zero-shot. Am I right?
Thanks for reading our paper. Yes, as I said before, fine-tuning is only for the feed-forward method. You will find `versioned_data` after downloading the HM3D dataset. Because the target objects are fixed in this task, we use more informative objects to find the relevance. If the target object were open-vocabulary, specifying the object category would not be suitable.
So what is the definition of zero-shot in your approach?
In this paper, zero-shot means that we don't need to train the semantic policy to learn semantic relevance from the training dataset. If you feel that selecting categories from the scene dataset does not qualify as zero-shot, we could instead select the special object categories based on object relevance in the language model; we only use the scene dataset as a reference, and we have to choose object categories from the limited set of semantic labels.
Is it possible to test on other categories in evaluation? Thanks for your kind response!
Sorry to bother you again: after I downloaded the mini_val split of HM3D, there are no train and val splits. Do I need to download `--uids hm3d_val` and `hm3d_train`? Thanks!
Sure! You can test on other categories, but you need to generate the HM3D evaluation dataset yourself. If you have problems with the HM3D dataset, please check related repos such as habitat-lab and habitat-sim.
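For reference, the HM3D splits discussed above are usually fetched with habitat-sim's dataset downloader. This is a sketch only: the exact uids, data path, and credential flags may differ across habitat-sim versions, and HM3D access requires Matterport API credentials (the placeholder values below are assumptions, not real tokens).

```shell
# Sketch: fetch the HM3D train/val splits via habitat-sim's downloader.
# Requires Matterport API credentials; replace the placeholders with your own.
python -m habitat_sim.utils.datasets_download \
  --uids hm3d_train hm3d_val \
  --data-path data/ \
  --username <matterport-api-token-id> \
  --password <matterport-api-token-secret>
```

After the download completes, the extracted scenes should appear under `data/versioned_data`, matching the directory layout mentioned earlier in this thread.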
If you use the zero-shot paradigm, no training is needed. For the feed-forward method, training the fine-tuning network is necessary.