Closed Shiro-LK closed 3 years ago
Hi,
Thank you for bringing this to my attention.
The code for the CharacterCNN is largely inspired by the ELMo code: https://github.com/allenai/allennlp/blob/master/allennlp/modules/elmo.py#L303 So there are a few leftover parts that I forgot to comment out, since they are no longer needed. There is no longer any need for the EOS and BOS tokens to be added implicitly during the forward pass, since the model expects the input (just like BERT) to be explicitly wrapped in [CLS] and [SEP]. The same goes for the mask variable: you can simply discard it.
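To illustrate what "explicitly wrapped" means here, a minimal sketch (not the actual character-bert code; the function name and special-token strings are illustrative) of wrapping the input before calling the model:

```python
def wrap_with_special_tokens(tokens):
    """Wrap a token sequence in [CLS]/[SEP] before the forward pass,
    instead of relying on the model to add BOS/EOS implicitly."""
    return ["[CLS]"] + tokens + ["[SEP]"]

wrapped = wrap_with_special_tokens(["hello", "world"])
# wrapped == ["[CLS]", "hello", "world", "[SEP]"]
```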
Sorry if the code is not perfectly clean, but it should work as intended :)
@Shiro-LK I realise that maybe I wasn't very clear about the reasons for discarding the mask variable. The reason is that the token mask is computed in advance, as you do in BERT when building token_ids, segment_ids, etc. See: https://github.com/helboukkouri/character-bert/blob/main/utils/data.py#L148 So it's done explicitly ahead of time instead of implicitly during the forward pass :)
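A hedged sketch of what "computed in advance" looks like in typical BERT-style preprocessing (the function and variable names here are illustrative, not the actual helpers from utils/data.py): the mask is built alongside the token ids and segment ids, before anything reaches the model.

```python
def build_inputs(tokens, max_seq_length):
    """Build padded tokens, input_mask and segment_ids ahead of the
    forward pass, BERT-style. The mask marks real tokens with 1 and
    padding positions with 0."""
    tokens = tokens[: max_seq_length - 2]          # leave room for specials
    tokens = ["[CLS]"] + tokens + ["[SEP]"]
    input_mask = [1] * len(tokens)
    segment_ids = [0] * len(tokens)
    padding = max_seq_length - len(tokens)         # pad up to fixed length
    tokens += ["[PAD]"] * padding
    input_mask += [0] * padding
    segment_ids += [0] * padding
    return tokens, input_mask, segment_ids
```

Since the mask is produced here, the forward pass can take it (or ignore it) as a precomputed input rather than deriving it internally.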
Hello,
Thank you for sharing the code and making your experiments reproducible. I have a few questions regarding the CharacterCNN class:
Is this normal?