thushv89 / attention_keras

Keras Layer implementation of Attention for Sequential models
https://towardsdatascience.com/light-on-math-ml-attention-with-keras-dc8dbc1fad39
MIT License
Topics: deep-learning, keras, lstm, rnn, tensorflow

TensorFlow (Keras) Attention Layer for RNN-based models

![Build Status (CircleCI)](https://circleci.com/gh/circleci/circleci-docs.svg?style=shield)

Version(s)

Introduction

This is an implementation of attention (only Bahdanau attention is supported right now).
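
For reference, here is a minimal NumPy sketch of what Bahdanau (additive) attention computes for a single decoder step. The weight names (`W_a`, `U_a`, `v_a`) and shapes are illustrative assumptions, not the exact variables used in `attention.py`.

```python
import numpy as np

def bahdanau_attention(decoder_state, encoder_outputs, W_a, U_a, v_a):
    """Additive (Bahdanau) attention for one decoder step (illustrative only).

    decoder_state:   (dec_dim,)          current decoder hidden state s_t
    encoder_outputs: (src_len, enc_dim)  encoder hidden states h_1 .. h_S
    W_a: (dec_dim, units), U_a: (enc_dim, units), v_a: (units,)
    """
    # Energies: e_{t,s} = v_a^T tanh(W_a s_t + U_a h_s)
    energies = np.tanh(decoder_state @ W_a + encoder_outputs @ U_a) @ v_a
    # Softmax over the source positions gives the attention weights.
    weights = np.exp(energies - energies.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of the encoder outputs.
    context = weights @ encoder_outputs
    return weights, context
```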

Project structure

data (Download data and place it here)
 |--- small_vocab_en.txt
 |--- small_vocab_fr.txt
src
 |--- layers
       |--- attention.py (Attention implementation)
 |--- examples
       |--- nmt
             |--- model.py (NMT model defined with Attention)
              |--- train.py (Code for training/inferring/plotting attention with the NMT model)
        |--- nmt_bidirectional
              |--- model.py (Bidirectional NMT model defined with Attention)
              |--- train.py (Code for training/inferring/plotting attention with the bidirectional NMT model)

How to use

Just like you would use any other tensorflow.python.keras.layers object.

from attention_keras.src.layers.attention import AttentionLayer

attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([encoder_outputs, decoder_outputs])

Here,

encoder_outputs - the full output sequence of the encoder RNN (i.e. produced with return_sequences=True)
decoder_outputs - the full output sequence of the decoder RNN
attn_out - the sequence of attention context vectors for the decoder, typically concatenated with the decoder outputs
attn_states - the attention energies, which can be used to plot attention heatmaps (see below)
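
Below is a slightly fuller sketch of how the layer could be wired into an encoder-decoder model. The import paths, layer sizes, input shapes, and variable names are illustrative assumptions, not the exact model from src/examples/nmt/model.py.

```python
from tensorflow.keras.layers import Input, LSTM, Concatenate
from tensorflow.keras.models import Model

from attention_keras.src.layers.attention import AttentionLayer

# Illustrative sizes; adjust to your data.
src_timesteps, tgt_timesteps, src_dim, tgt_dim, hidden = 20, 20, 200, 200, 96

encoder_inputs = Input(shape=(src_timesteps, src_dim))
decoder_inputs = Input(shape=(tgt_timesteps, tgt_dim))

# Both RNNs must return their full output sequences (return_sequences=True).
encoder_outputs, enc_h, enc_c = LSTM(hidden, return_sequences=True, return_state=True)(encoder_inputs)
decoder_outputs = LSTM(hidden, return_sequences=True)(decoder_inputs, initial_state=[enc_h, enc_c])

# Attention over the encoder outputs, queried by the decoder outputs.
attn_out, attn_states = AttentionLayer(name='attention_layer')([encoder_outputs, decoder_outputs])

# The context vectors are typically concatenated with the decoder outputs
# before the final projection/softmax layer.
decoder_concat = Concatenate(axis=-1)([decoder_outputs, attn_out])

model = Model(inputs=[encoder_inputs, decoder_inputs], outputs=decoder_concat)
model.summary()
```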

Visualizing Attention weights

An example of visualizing attention weights can be found in the NMT example's train.py (src/examples/nmt/train.py).

After the model is trained, the attention heatmap should look like the one below.

Attention heatmap
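
As a rough sketch of how such a heatmap can be produced (the actual plotting code lives in the example's train.py), the energies returned as attn_states can be rendered with matplotlib. The plot_attention function below and its arguments are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_attention(attention_weights, src_tokens, tgt_tokens):
    """Plot an attention heatmap (illustrative sketch).

    attention_weights: NumPy array of shape (target timesteps, source timesteps),
    e.g. one example taken from the attn_states output after inference.
    """
    fig, ax = plt.subplots()
    ax.imshow(attention_weights, cmap='viridis')
    # Label the axes with the source and target tokens.
    ax.set_xticks(np.arange(len(src_tokens)))
    ax.set_xticklabels(src_tokens, rotation=90)
    ax.set_yticks(np.arange(len(tgt_tokens)))
    ax.set_yticklabels(tgt_tokens)
    ax.set_xlabel('Source (encoder) position')
    ax.set_ylabel('Target (decoder) position')
    plt.show()
```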

Running the NMT example

Prerequisites

Using the docker image

Using a virtual environment

Running the code

If you would like to show support

If you'd like to show your appreciation, you can buy me a coffee. No stress! It's totally optional. The support I receive definitely helps me maintain this repository and continue with my other contributions.


If you have improvements (e.g. other attention mechanisms), contributions are welcome!