clay-lab / transductions

A PyTorch framework for creating, running, and reproducing experiments on seq2seq models.

BERT #57

Open jopetty opened 3 years ago

jopetty commented 3 years ago

Would be nice to have BERT as an option for the encoder. Some issues are:

jopetty commented 3 years ago

aedca6f adds a "working" BERT model (i.e., one that runs without erroring), but it doesn't seem to learn very well. Among the design considerations:

jopetty commented 3 years ago

The positional encodings built into the HuggingFace BERT models do not seem to be useful in a sequence-to-sequence context. I'm not sure why this is, but it is fixable by adding our own positional encodings to the embedding layer of the pretrained models.
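The commit itself isn't shown here, but a minimal sketch of the fix might look like the following: a standard fixed sinusoidal positional-encoding module (Vaswani et al. style) whose output is added on top of the pretrained model's embeddings. The class name and the exact integration point are assumptions, not the repo's actual implementation.

```python
import math
import torch
import torch.nn as nn

class SinusoidalPositionalEncoding(nn.Module):
    """Fixed (non-learned) sinusoidal positional encodings, added
    onto token embeddings of shape (batch, seq_len, d_model)."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
        # Geometric progression of wavelengths, as in the Transformer paper
        div_term = torch.exp(
            torch.arange(0, d_model, 2, dtype=torch.float)
            * (-math.log(10000.0) / d_model)
        )
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        # Buffer, not parameter: saved with the model but never trained
        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.pe[: x.size(1)]
```

In use, this would be applied to the hidden states coming out of the pretrained embedding layer (or the encoder output) before they reach the decoder, so the seq2seq head sees position information that doesn't depend on BERT's learned positional embeddings.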