Open ghost opened 6 years ago
Has anyone tried implementing the Transformer network in this framework?
I am talking about this paper:
https://arxiv.org/abs/1706.03762
I've noticed that OpenNMT has adopted this architecture and improved its results.