Closed annanfree closed 2 months ago
Why use a Transformer decoder for slot filling? Why not something simpler? What is the reason or intuition behind this structure?

Hi @annanfree, the decoder needs a history of what it has already generated. The intuition was to use a network that has memory but does not have the limitations of LSTMs. A regular Transformer decoder did not perform well because of the alignment issue, so we introduced the aligned Transformer decoder. I highly recommend reading the ScienceDirect paper carefully.
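As a rough illustration (not the paper's actual implementation), the core idea can be sketched as causally masked self-attention: each position attends only to the labels already emitted (its generation history), while still producing exactly one output per input token, which preserves the token-to-slot alignment. All names and dimensions below are made up for the example.

```python
import numpy as np

def causal_mask(T):
    # Position i may attend only to positions <= i, i.e. its own
    # generation history -- the "memory" role an LSTM would play.
    return np.tril(np.ones((T, T), dtype=bool))

def masked_self_attention(Q, K, V, mask):
    # Standard scaled dot-product attention with disallowed
    # positions pushed to -1e9 before the softmax.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(mask, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

T, d = 5, 8  # 5 input tokens, toy embedding size 8
rng = np.random.default_rng(0)
x = rng.normal(size=(T, d))  # embeddings of previously emitted slot labels
out = masked_self_attention(x, x, x, causal_mask(T))
print(out.shape)  # (5, 8): one output per input token, alignment preserved
```

The point of the mask is that, unlike a vanilla encoder-style Transformer, each step conditions on what was generated so far; and unlike free-running autoregressive decoding, the output length is pinned to the input length, which is what slot filling requires.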