helboukkouri / character-bert

Main repository for "CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters"
Apache License 2.0

CharacterCNN mask not used? #1

Closed Shiro-LK closed 3 years ago

Shiro-LK commented 3 years ago

Hello,

Thank you for sharing the code and making your experiment reproducible. I have a few questions regarding the CharacterCNN class:

Is this normal?

helboukkouri commented 3 years ago

Hi,

Thank you for bringing this to my attention.

The code for the CharacterCNN is largely inspired by the ELMo code: https://github.com/allenai/allennlp/blob/master/allennlp/modules/elmo.py#L303. As a result, there are a few leftover pieces that I forgot to comment out, since they are no longer needed. In particular, there is no longer any need for BOS and EOS tokens to be added implicitly during the forward pass, because the model expects the input to be explicitly wrapped in [CLS] and [SEP], just like BERT. The same goes for the mask variable: you can simply discard it.
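For illustration, here is a minimal sketch of what "explicitly wrapped" input preparation looks like, roughly along the lines of this repository's usage example. The `CharacterIndexer` class and its `as_padded_tensor` method are assumed to match the helpers in `utils/character_cnn.py`; treat the exact names and paths as assumptions, not a verbatim quote of the codebase:

```python
from transformers import BertTokenizer
from utils.character_cnn import CharacterIndexer  # assumed helper from this repo
from modeling.character_bert import CharacterBertModel  # assumed model class from this repo

# Tokenize, then wrap the sequence explicitly with special tokens, just like BERT.
# No BOS/EOS is added implicitly inside the forward pass.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
tokens = tokenizer.basic_tokenizer.tokenize("CharacterBERT works at the word level")
tokens = ['[CLS]'] + tokens + ['[SEP]']

# Convert the wrapped tokens into padded character-index tensors.
indexer = CharacterIndexer()
input_tensor = indexer.as_padded_tensor([tokens])  # shape: (batch, seq_len, max_word_len)

# The model consumes the already-wrapped input directly.
model = CharacterBertModel.from_pretrained('./pretrained-models/general_character_bert/')
output = model(input_tensor)
```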

Sorry if the code is not perfectly clean, but it should work as intended :)

helboukkouri commented 3 years ago

@Shiro-LK I realise that maybe I wasn't very clear about the reasons for discarding the mask variable. The reason is that the token mask is computed in advance, just as you would in BERT when building token_ids, segment_ids, etc. See: https://github.com/helboukkouri/character-bert/blob/main/utils/data.py#L148. So it's done explicitly ahead of time instead of implicitly during the forward pass :)
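To make that concrete, here is a minimal sketch of BERT-style input preparation where the attention mask is built up front together with the other input tensors. The function and field names mirror standard BERT preprocessing for illustration, not the exact code in utils/data.py:

```python
import torch

def build_inputs(tokens, max_seq_length, tokenizer):
    """Build BERT-style inputs with the mask computed in advance."""
    # Real tokens get mask value 1; padding positions will get 0.
    input_mask = [1] * len(tokens)
    segment_ids = [0] * len(tokens)
    input_ids = tokenizer.convert_tokens_to_ids(tokens)

    # Pad all three sequences to the fixed maximum length.
    padding = [0] * (max_seq_length - len(tokens))
    input_ids += padding
    input_mask += padding
    segment_ids += padding

    # The mask is now an explicit input tensor, so the forward pass
    # never has to infer it from the token ids.
    return (
        torch.tensor([input_ids]),
        torch.tensor([input_mask]),
        torch.tensor([segment_ids]),
    )
```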