rafiepour / CTran

Complete code for the proposed CNN-Transformer model for natural language understanding.
https://github.com/rafiepour/CTran
Apache License 2.0

CTRAN: CNN-Transformer-based Network for Natural Language Understanding

The PyTorch implementation of CTRAN, as described in https://www.sciencedirect.com/science/article/pii/S0952197623011971.


Introduction

CTRAN CNN-Transformer Model Architecture

This repository contains the complete source code of the proposed CTRAN network for joint intent detection and slot filling, the two main tasks of natural language understanding. We propose an encoder-decoder model combining CNNs with Transformers, and define the concept of alignment in the Transformer decoder for the first time. The encoder is shared between the intent detection and slot filling tasks, so the model benefits from the implicit dependency between them. In CTRAN's shared encoder, BERT provides the word embeddings. A convolutional operation is then applied to these embeddings, followed by the "window feature sequence" structure, which transposes the CNN output to the required shape instead of using pooling operations. A stack of Transformer encoders then creates a new contextualized representation of the input, one that also incorporates the embedding generated by the CNN. For intent detection, the decoder comprises self-attention and a linear layer that produces the output probabilities. For slot filling, we propose the novel aligned Transformer decoder, followed by a fully connected layer. For more information, please refer to the article in Engineering Applications of Artificial Intelligence (EAAI).
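To make the data flow concrete, here is a minimal PyTorch sketch of the pipeline described above. It is not the repository's actual code: the class name `CTRANSketch`, the kernel size, the number of encoder layers, the [CLS]-based intent classification, and the plain per-token linear slot head (a simple stand-in for the paper's aligned Transformer decoder) are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class CTRANSketch(nn.Module):
    """Illustrative sketch of the CTRAN pipeline: BERT embeddings -> CNN ->
    "window feature sequence" -> Transformer encoder stack -> two heads.
    Hyperparameters and head designs are assumptions, not the repo's code."""

    def __init__(self, n_intents, n_slots, d_model=768, kernel_size=3,
                 n_enc_layers=2, n_heads=8):
        super().__init__()
        # Shared encoder: BERT supplies contextual word embeddings.
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # Convolution over the token axis; same-padding preserves sequence length.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_enc_layers)
        # Intent head: self-attention over the shared representation + linear.
        self.intent_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.intent_fc = nn.Linear(d_model, n_intents)
        # Slot head: per-token linear layer, kept aligned one-to-one with input
        # tokens (a placeholder for the paper's aligned Transformer decoder).
        self.slot_fc = nn.Linear(d_model, n_slots)

    def forward(self, input_ids, attention_mask):
        emb = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state
        # Conv1d expects (batch, channels, seq): transpose in, convolve, then
        # transpose back. This is the "window feature sequence": every token
        # keeps its windowed CNN features instead of being pooled away.
        window_seq = torch.relu(self.conv(emb.transpose(1, 2))).transpose(1, 2)
        pad_mask = attention_mask == 0
        shared = self.encoder(window_seq, src_key_padding_mask=pad_mask)
        # Intent: self-attend over the sequence, classify the [CLS] position.
        attn_out, _ = self.intent_attn(shared, shared, shared,
                                       key_padding_mask=pad_mask)
        intent_logits = self.intent_fc(attn_out[:, 0])
        # Slots: one label distribution per (aligned) input token.
        slot_logits = self.slot_fc(shared)
        return intent_logits, slot_logits
```

The transpose-in, transpose-out pattern around the convolution is what preserves a per-token feature sequence, which in turn is what allows the slot head to emit exactly one label per input token while sharing the same encoder output with the intent head.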

Requirements