This repo provides a Keras implementation of the MT-LSTM from the paper Learned in Translation: Contextualized Word Vectors (McCann et al., 2017). For a high-level overview of why CoVe vectors are useful, check out the post.
The weights are ported from the PyTorch implementation of the MT-LSTM by the paper's authors - https://github.com/salesforce/cove
Ported & tested on:
Re-running PortFromPytorchToKeras.ipynb requires the PyTorch MT-LSTM implementation from https://github.com/salesforce/cove
import numpy as np
from keras.models import load_model

# Load the ported CoVe model and run it on a random batch:
# 1 sequence of 10 time steps with 300-dimensional word vectors.
cove_model = load_model('Keras_CoVe.h5')
cove_model.predict(np.random.rand(1, 10, 300))
At the time of porting, Keras had an issue with using a Masking layer together with a Bidirectional layer - https://github.com/keras-team/keras/issues/3086. As a workaround, the output of the final Bi-LSTM is zeroed out at padded time steps after prediction; refer to PortFromPytorchToKeras.ipynb for the details of this shortcut fix.
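The workaround can be sketched in plain NumPy: given the Bi-LSTM outputs and the true (unpadded) sequence lengths, multiply by a binary mask so padded positions contribute zeros. The shapes and variable names below are illustrative, not taken from the notebook.

```python
import numpy as np

# Illustrative shapes: 2 sequences, 5 time steps, 4 output units
batch, timesteps, dim = 2, 5, 4
outputs = np.ones((batch, timesteps, dim))  # stand-in for Bi-LSTM outputs
lengths = np.array([5, 3])                  # true lengths; sequence 1 has 2 padded steps

# Mask of shape (batch, timesteps): 1 for real tokens, 0 for padding
mask = (np.arange(timesteps)[None, :] < lengths[:, None]).astype(outputs.dtype)

# Zero out the outputs at padded time steps
masked_outputs = outputs * mask[:, :, None]
```

This reproduces the effect a working Masking layer would have had on the final outputs, applied after the fact.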
For unknown words, we recommend using a value different from the one used for padding; a small non-zero value such as 1e-10 is recommended.
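A minimal sketch of that recommendation: keep the padding vector at all zeros (so the masking fix above can identify it) and assign unknown words a distinct small non-zero vector. The 300-dimensional size matches the GloVe inputs the model expects; the variable names are hypothetical.

```python
import numpy as np

emb_dim = 300  # CoVe expects 300-dimensional GloVe input vectors

pad_vector = np.zeros(emb_dim)        # padding stays all-zero
unk_vector = np.full(emb_dim, 1e-10)  # small non-zero value for unknown words
```

Keeping the two values distinct ensures unknown words are not silently treated as padding.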