cshizhe / VLN-HAMT

Official implementation of History Aware Multimodal Transformer for Vision-and-Language Navigation (NeurIPS'21).
MIT License
99 stars, 12 forks