YicongHong / Recurrent-VLN-BERT

Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation

Why don't you use ‘speaker’ during training? #12

Closed CrystalSixone closed 2 years ago

CrystalSixone commented 2 years ago

Hi! I don't see any code for the 'speaker', which is a useful way to do data augmentation for R2R. Why did you remove the speaker part from your code? Or have you run experiments showing that using a speaker doesn't work well with your method? Thanks a lot!

YicongHong commented 2 years ago

Hi there! We train with the PREVALENT augmented data, which is produced by a Speaker model. We didn't apply the Speaker on the fly because we believe that using the pre-processed augmented data should be similar to sampling new data with a Speaker, and it greatly reduces training cost.
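
For readers unfamiliar with this setup, the following is a rough illustration of training on pre-computed augmented data instead of calling a Speaker during training. It is not code from this repository: the function name `alternate_batches`, the `instr_id` field, and the strict one-to-one alternation between the two pools are all assumptions made for the sketch, and the actual training loop may mix the data differently.

```python
import random
from itertools import cycle


def alternate_batches(original, augmented, batch_size, num_steps, seed=0):
    """Yield (source, batch) pairs, alternating between two fixed data pools.

    Hypothetical helper: `augmented` stands in for instructions that a Speaker
    model generated offline, so no Speaker is run inside the training loop.
    """
    rng = random.Random(seed)
    pools = cycle([("original", original), ("augmented", augmented)])
    for _ in range(num_steps):
        source, pool = next(pools)
        # Sample without replacement within a batch; cap at the pool size.
        yield source, rng.sample(pool, min(batch_size, len(pool)))


# Toy data standing in for human-annotated and Speaker-generated samples.
original = [{"instr_id": f"r2r_{i}"} for i in range(4)]
augmented = [{"instr_id": f"aug_{i}"} for i in range(10)]

steps = list(alternate_batches(original, augmented, batch_size=2, num_steps=4))
for source, batch in steps:
    print(source, [item["instr_id"] for item in batch])
```

Because the augmented pool is read from disk once, each training step costs the same as a step on the original data, which is the cost saving described above.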

CrystalSixone commented 2 years ago

Oh, I see! Thanks for your kind reply 👍