MarSaKi / ETPNav

[TPAMI 2024] Official repo of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"
MIT License

Feature Extraction Codes for Pretraining #8

Open LYX0501 opened 1 month ago

LYX0501 commented 1 month ago

Hi An,

Thanks for open-sourcing the ETPNav code. ETPNav is fascinating work! I'm opening this issue to ask where to find the feature processing code for pretraining. If it isn't available yet, do you plan to release this part of the code?

MarSaKi commented 1 month ago

Thanks for your attention. Sorry, I've left my original affiliation and can't find the feature extraction code. Alternatively, you can modify these scripts to extract feature vectors from habitat-sim. Please note that the scripts as written extract grid features from habitat-sim; you would need to change them to extract feature vectors. Best regards,
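For readers wondering what changing "grid features" into "feature vectors" might look like: one common approach (an assumption here, not something confirmed in this thread) is to global-average-pool each view's C×H×W grid into a single C-dimensional vector. A minimal NumPy sketch, with a hypothetical function name and illustrative shapes:

```python
import numpy as np


def grid_to_vector(grid_feats: np.ndarray) -> np.ndarray:
    """Collapse grid features (views, C, H, W) into vectors (views, C).

    Assumption: global average pooling over the two spatial axes.
    The pooling actually used for ETPNav pretraining is not stated in the thread.
    """
    if grid_feats.ndim != 4:
        raise ValueError(f"expected (views, C, H, W), got shape {grid_feats.shape}")
    # Average over the spatial axes H and W, keeping one value per channel.
    return grid_feats.mean(axis=(2, 3))


# Illustrative shapes only: 12 views, 2048 channels, a 7x7 spatial grid.
grid = np.random.rand(12, 2048, 7, 7).astype(np.float32)
vectors = grid_to_vector(grid)
print(vectors.shape)  # (12, 2048)
```

Pooling like this suits convolutional backbones. For a ViT-style encoder such as CLIP, the per-image vector is normally the projected class-token embedding returned by the image encoder, so no spatial pooling step would be needed there.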

LYX0501 commented 1 month ago

Hey An, thanks for your response! I've looked at the scripts you mentioned. I just want to confirm 2 points: whether both RGB and depth features for ETPNav

MarSaKi commented 1 month ago

Exactly. ETPNav needs both RGB and depth feature vectors for pre-training. The RGB features are extracted with CLIP-ViT-B/32, while the depth features are extracted with the "gibson-4plus-mp3d-train-val-test-resnet50.pth" checkpoint from this link.
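The reply above implies that every view ends up with two vectors, one from the RGB encoder and one from the depth encoder. A rough sketch of that pairing follows. The function `extract_view_features`, the stub encoders, the 12-view layout, the image sizes, and the depth dimension of 128 are all hypothetical placeholders; only the 512-dimensional output of CLIP ViT-B/32 is a known property of that model. In practice the stubs would be swapped for the real CLIP and depth ResNet-50 forward passes.

```python
from typing import Callable, Dict

import numpy as np

# An encoder maps a batch of images (N, H, W, C) to feature vectors (N, D).
Encoder = Callable[[np.ndarray], np.ndarray]


def extract_view_features(
    rgb_views: np.ndarray,
    depth_views: np.ndarray,
    rgb_encoder: Encoder,
    depth_encoder: Encoder,
) -> Dict[str, np.ndarray]:
    """Encode RGB and depth observations of the same views into two arrays."""
    if len(rgb_views) != len(depth_views):
        raise ValueError("RGB and depth must cover the same number of views")
    rgb_feats = np.asarray(rgb_encoder(rgb_views), dtype=np.float32)
    depth_feats = np.asarray(depth_encoder(depth_views), dtype=np.float32)
    if rgb_feats.ndim != 2 or depth_feats.ndim != 2:
        raise ValueError("encoders must return one vector per view")
    return {"rgb": rgb_feats, "depth": depth_feats}


def stub_rgb_encoder(images: np.ndarray) -> np.ndarray:
    # Stand-in for the CLIP ViT-B/32 image encoder (512-d output).
    return np.zeros((len(images), 512), dtype=np.float32)


def stub_depth_encoder(depths: np.ndarray) -> np.ndarray:
    # Stand-in for the depth ResNet-50; 128 is an illustrative dimension.
    return np.zeros((len(depths), 128), dtype=np.float32)


# Illustrative input: 12 views with dummy RGB and depth frames.
rgb = np.zeros((12, 224, 224, 3), dtype=np.uint8)
depth = np.zeros((12, 256, 256, 1), dtype=np.float32)
feats = extract_view_features(rgb, depth, stub_rgb_encoder, stub_depth_encoder)
print(feats["rgb"].shape, feats["depth"].shape)  # (12, 512) (12, 128)
```

Keeping the two modalities in separate arrays, rather than concatenating them, leaves the choice of how to fuse them to the downstream model.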

MarSaKi commented 1 month ago

I've updated the feature extraction code; you can give it a try: https://github.com/MarSaKi/ETPNav/blob/main/precompute_img_features/run.bash

LYX0501 commented 1 month ago

Thank you very much for your help! Undoubtedly, ETPNav provides an excellent benchmark for future research on VLN.