Closed: dmmiller612 closed this 5 years ago
Thank you, @dmmiller612. Nice work!
@dmmiller612 Hello. I added 'Masked' multi-head attention using torch.triu and edited your greedy decoder code in my new branch (Transformer)! Do you agree with the changes? A minimal sketch of the masking idea follows below.
Please see my diff commits:
1) 'Masked' multi-head attention in your greedy decoder: https://github.com/graykode/nlp-tutorial/commit/005d34bfa3cafb822599a526b85b732e2846213d
2) 'Masked' multi-head attention in the original Transformer: https://github.com/graykode/nlp-tutorial/commit/5b4fb5ebca712f72e747674333dafc10182367e5
Thanks
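For reference, here is a minimal sketch of how a look-ahead ("subsequent") mask can be built with torch.triu. This is not the exact code from the linked commits; the function name and tensor shapes are my own illustration:

```python
import torch

def get_attn_subsequent_mask(seq):
    # seq: [batch_size, tgt_len]
    batch_size, tgt_len = seq.size()
    # The upper triangle above the diagonal marks future positions to hide.
    mask = torch.triu(torch.ones(tgt_len, tgt_len, dtype=torch.uint8), diagonal=1)
    # Expand to [batch_size, tgt_len, tgt_len]; 1 = masked, 0 = visible.
    return mask.unsqueeze(0).expand(batch_size, -1, -1)

# Inside scaled dot-product attention, scores at masked positions are
# pushed toward -inf before the softmax so the model cannot attend to them:
#   scores.masked_fill_(mask.bool(), -1e9)
```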
In this PR, I added a greedy decoder function that generates the decoder input at inference time. This is important for translating sentences, since the target input is not known beforehand. In the paper, the authors ran beam search with beam size k = 4; the greedy approach is the special case k = 1.
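A minimal sketch of greedy decoding under those assumptions (this is not the exact PR code; the `model` call signature and the `start_symbol` / `end_symbol` ids are hypothetical placeholders):

```python
import torch

def greedy_decoder(model, enc_input, start_symbol, end_symbol, max_len=30):
    # Start the decoder input with only the start symbol.
    dec_input = torch.tensor([[start_symbol]], dtype=torch.long)
    for _ in range(max_len):
        # Assumed signature: model(enc_input, dec_input) -> [1, cur_len, vocab_size]
        logits = model(enc_input, dec_input)
        # Pick the single best next token at each step (k = 1).
        next_token = logits[0, -1].argmax().item()
        dec_input = torch.cat(
            [dec_input, torch.tensor([[next_token]], dtype=torch.long)], dim=1)
        if next_token == end_symbol:  # stop once the end symbol is produced
            break
    return dec_input
```

Beam search would instead keep the k highest-scoring partial sequences at every step; greedy decoding is simply that procedure with k fixed to 1.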