antoyang / just-ask

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
https://arxiv.org/abs/2012.00451
Apache License 2.0

Question about the preprocessing of the LSMDC-FiB dataset #7

Closed vateye closed 2 years ago

vateye commented 2 years ago

Hi, I have read your paper "FrozenBiLM". I have a few questions about the preprocessing of the LSMDC-FiB dataset. I noticed that some blanks cover only part of a word. For example, in "I went to the place w___e I live." the answer would be "her". In such cases the semantic meaning of the question is destroyed. How do you treat this type of question?

Thanks.

antoyang commented 2 years ago

Hi, I did not do any specific preprocessing for this case. In all cases, I replaced the blank with a [MASK] token.

vateye commented 2 years ago

So you treat the question as "I went to the place w[MASK]e I live."?

antoyang commented 2 years ago

Yes.
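
For readers following along, the replacement described above can be sketched in a few lines of Python. This is only an illustration, not code from the repository: the function name `mask_blank`, the `BLANK_PATTERN` regex, and the assumption that a blank is written as a run of two or more underscores are all hypothetical, and the actual LSMDC-FiB annotation format may differ.

```python
import re

# Assumption: a blank is a run of two or more underscores.
# The real LSMDC-FiB annotations may mark blanks differently.
BLANK_PATTERN = re.compile(r"_{2,}")


def mask_blank(sentence: str, mask_token: str = "[MASK]") -> str:
    """Replace every blank in `sentence` with `mask_token`.

    No special handling is applied when the blank sits inside a word,
    which mirrors the behaviour described in this thread.
    """
    return BLANK_PATTERN.sub(mask_token, sentence)


print(mask_blank("I went to the place w___e I live."))
# prints: I went to the place w[MASK]e I live.
```

With this approach a partial-word blank stays attached to the surrounding characters ("w" and "e"), so how the result is split into tokens depends on the tokenizer used downstream.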

vateye commented 2 years ago

Thanks.