
Context Vectors (CoVe)

This repo provides a Keras implementation of the MT-LSTM from the paper Learned in Translation: Contextualized Word Vectors (McCann et al., 2017). For a high-level overview of why CoVe vectors are useful, see the authors' blog post.

The weights are ported from the PyTorch implementation of the MT-LSTM released by the paper's authors: https://github.com/salesforce/cove

Dependencies

Ported & tested on:

Re-running PortFromPytorchToKeras.ipynb additionally requires the PyTorch MT-LSTM implementation from https://github.com/salesforce/cove

Usage

Loading model

from keras.models import load_model
cove_model = load_model('Keras_CoVe.h5')

Prediction
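
A minimal sketch of a forward pass with the cove_model loaded above, assuming the input is a (batch, timesteps, 300) array of GloVe embeddings as in the original PyTorch implementation (the random array below is only a stand-in for real embeddings):

import numpy as np

# Dummy stand-in for a batch of GloVe-embedded sentences:
# 1 sentence, 10 tokens, 300-dimensional GloVe vectors.
glove_input = np.random.rand(1, 10, 300).astype('float32')

# One CoVe vector per token; the MT-LSTM is bidirectional with
# 300 units per direction, so this prints (1, 10, 600).
cove_output = cove_model.predict(glove_input)
print(cove_output.shape)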

Padding

At the time of porting, Keras had an issue with using Masking together with a Bidirectional layer (https://github.com/keras-team/keras/issues/3086). A shortcut fix is applied instead: the output of the final Bi-LSTM at padded positions is removed from the prediction. Refer to PortFromPytorchToKeras.ipynb for the details of this fix.
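
As a caller-side illustration only (the helper below, pad_glove_batch, is hypothetical and not part of the ported code), one way to work with this is to zero-pad the GloVe inputs to a fixed length, keep a mask of the real tokens, and discard the CoVe outputs at padded positions:

import numpy as np

def pad_glove_batch(sentences, max_len, dim=300):
    # sentences: list of (length_i, 300) GloVe arrays.
    # Returns a zero-padded (batch, max_len, 300) array plus a boolean
    # mask marking the real (non-padded) tokens.
    batch = np.zeros((len(sentences), max_len, dim), dtype='float32')
    mask = np.zeros((len(sentences), max_len), dtype=bool)
    for i, sent in enumerate(sentences):
        n = min(len(sent), max_len)
        batch[i, :n] = sent[:n]
        mask[i, :n] = True
    return batch, mask

# cove_out = cove_model.predict(batch)
# cove_out[~mask] = 0.0   # ignore CoVe vectors at padded positions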

Unknown words

For unknown words, we recommend using a fill value different from the one used for padding; a small non-zero constant such as 1e-10 works well.
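
A minimal sketch of this convention, assuming a GloVe lookup dict named glove (the name and helper embed_tokens are chosen here for illustration): out-of-vocabulary tokens get a constant 1e-10 vector so they stay distinguishable from the zero vectors used for padding:

import numpy as np

UNK_VECTOR = np.full(300, 1e-10, dtype='float32')  # unknown words
PAD_VECTOR = np.zeros(300, dtype='float32')        # padding

def embed_tokens(tokens, glove, max_len):
    # glove: dict mapping token -> 300-d GloVe vector (assumed available)
    vectors = [glove.get(tok, UNK_VECTOR) for tok in tokens[:max_len]]
    vectors += [PAD_VECTOR] * (max_len - len(vectors))
    return np.stack(vectors)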

Implementation Details

Reference

  1. MT-LSTM from the paper Learned in Translation: Contextualized Word Vectors (McCann et al., 2017)
  2. MT-LSTM PyTorch implementation from which the weights are ported: https://github.com/salesforce/cove
  3. GloVe