hzeng-otterai / nlp

Deep NLP algorithms based on PyTorch and AllenNLP
MIT License

Improved BiMPM in AllenNLP #4

Closed — dcfidalgo closed this issue 4 years ago

dcfidalgo commented 4 years ago

@handsomezebra Sorry to bother you via this channel; I thought it would be more appropriate to open an issue here than to email you directly.

We are starting to use your BiMPM implementation in AllenNLP. Compared to the paper (https://arxiv.org/pdf/1702.03814.pdf), you pass not only the final context representation to the matching layer, but also the "intermediate context representation" (the output of encoder1) and the word embeddings. What was your motivation? Did you do some comparison between these settings?
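For reference, the setup I mean looks roughly like this: matching is applied at three representation levels and the matching vectors are concatenated. This is only a minimal sketch with hypothetical module and method names (a single cosine full-matching step stands in for the paper's multi-perspective matching), not your actual implementation:

```python
import torch
import torch.nn as nn


class MultiLevelMatcher(nn.Module):
    """Sketch: match two sentences not only on the final encoder
    output, but also on the word embeddings and the intermediate
    encoder output, then concatenate the matching vectors.
    All names here are hypothetical."""

    def __init__(self, embed_dim: int, hidden_dim: int):
        super().__init__()
        self.encoder1 = nn.LSTM(embed_dim, hidden_dim,
                                batch_first=True, bidirectional=True)
        self.encoder2 = nn.LSTM(2 * hidden_dim, hidden_dim,
                                batch_first=True, bidirectional=True)

    @staticmethod
    def cosine_match(p: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # Simplified "full matching": compare every premise time step
        # against the hypothesis' last time step.
        h_last = h[:, -1:, :]                                   # (batch, 1, dim)
        return nn.functional.cosine_similarity(p, h_last, dim=-1)  # (batch, len_p)

    def forward(self, embedded_p: torch.Tensor, embedded_h: torch.Tensor) -> torch.Tensor:
        enc1_p, _ = self.encoder1(embedded_p)
        enc1_h, _ = self.encoder1(embedded_h)
        enc2_p, _ = self.encoder2(enc1_p)
        enc2_h, _ = self.encoder2(enc1_h)
        # Match at every representation level, then concatenate.
        matches = [
            self.cosine_match(embedded_p, embedded_h),  # word embeddings
            self.cosine_match(enc1_p, enc1_h),          # intermediate context (encoder1)
            self.cosine_match(enc2_p, enc2_h),          # final context (encoder2)
        ]
        return torch.cat(matches, dim=-1)               # (batch, 3 * len_p)
```

The paper, as I read it, would only use the third of those matching calls.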

Sorry again to bother you with this "old" stuff. Have a great day!

hzeng-otterai commented 4 years ago

Hi, sorry for the late response. I totally missed it.

For the question: I didn't do a comparison. But usually, when there is enough data, the more information you include in the representation, the better the results you can get. This is also the case in other models, for example when you use BERT as a feature extractor: the BERT model has many layers, and using more of them tends to give better results.
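The BERT-as-features idea amounts to concatenating several hidden layers along the feature dimension instead of keeping only the top one. A minimal sketch (the tensors below simulate BERT-base's 13 hidden states, i.e. the embedding layer plus 12 transformer layers; the helper name is made up):

```python
import torch


def combine_layer_features(hidden_states, num_layers: int = 4) -> torch.Tensor:
    """Concatenate the last `num_layers` hidden states along the
    feature dimension, a common recipe when using a frozen BERT
    as a feature extractor."""
    return torch.cat(list(hidden_states[-num_layers:]), dim=-1)


# Simulated hidden states: 13 tensors of shape (batch, seq_len, 768),
# standing in for BERT-base's embedding layer + 12 transformer layers.
hidden_states = tuple(torch.randn(2, 10, 768) for _ in range(13))

features = combine_layer_features(hidden_states, num_layers=4)
# features has shape (2, 10, 3072): four 768-dim layers concatenated.
```

Varying `num_layers` here is the knob that corresponds to "using more layers" above.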