linjieli222 / HERO

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
https://arxiv.org/abs/2005.00200
MIT License

question about vocab size #25

Closed taeho-kil closed 3 years ago

taeho-kil commented 3 years ago

In the pre-training configuration file "hero_pretrain.json", the vocab size in f_config is 50,265 (it may come from the RoBERTa model).

However, the pre-trained model 'hero-tv-ht100.pt' has a vocab size of 50,272 in f_config (I checked the dimension of model.v_encoder.f_encoder.lm_head.decoder).
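
For reference, a minimal sketch of how this dimension can be checked, assuming the released checkpoint is a plain PyTorch state dict and that the parameter key mirrors the attribute path above (adjust the key if the checkpoint is nested differently):

```python
import torch

# Load the released checkpoint on CPU and inspect the decoder weight.
# The key name below is an assumption based on the attribute path
# model.v_encoder.f_encoder.lm_head.decoder.
ckpt = torch.load("hero-tv-ht100.pt", map_location="cpu")
weight = ckpt["v_encoder.f_encoder.lm_head.decoder.weight"]
print(weight.shape[0])  # prints 50272 rather than the expected 50265
```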

Which configuration file was used when the 'hero-tv-ht100.pt' model was trained?

linjieli222 commented 3 years ago

@xellows1305

Thank you for your interest in our project, and sorry for the late response.

When we pre-train the model, ht_pretrain.json is used as the config file. The vocab size change comes from https://github.com/linjieli222/HERO/blob/f938515424b5f3249fc1d2e7f0373f64112a6529/model/model.py#L363

In pad_vocab(), we pad the word embeddings to a multiple of 8 to fully utilize the tensor cores on our GPUs.

You can also refer to the function where the padding is implemented: https://github.com/linjieli222/HERO/blob/f938515424b5f3249fc1d2e7f0373f64112a6529/model/modeling_utils.py#L124-L135
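
If you don't want to follow the links, here is a minimal sketch of the idea (not the exact HERO code): pad the (vocab_size, hidden) embedding matrix with extra rows until vocab_size is divisible by 8.

```python
import torch

def pad_vocab(emb: torch.Tensor, pad_to: int = 8) -> torch.Tensor:
    """Pad a (vocab_size, hidden) embedding so vocab_size is a multiple of pad_to."""
    vocab_size, hidden = emb.shape
    remainder = vocab_size % pad_to
    if remainder == 0:
        return emb
    pad_rows = pad_to - remainder
    # Extra rows are zero-initialized and never indexed by real tokens.
    return torch.cat([emb, emb.new_zeros(pad_rows, hidden)], dim=0)

# RoBERTa's 50,265-token vocab is padded to 50,272 = 6,284 * 8,
# which matches the size observed in the released checkpoint.
print(pad_vocab(torch.randn(50265, 768)).shape)  # torch.Size([50272, 768])
```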

Thanks.

linjieli222 commented 3 years ago

Closed due to inactivity.