Hi,
I was very impressed with this repository while researching how to apply the Transformer ("Attention Is All You Need") to time series forecasting.
I have a question. The architecture in "Attention Is All You Need" uses both an encoder and a decoder to handle the translation task, whereas your implementation in this repository uses only the encoder. Were there any other papers or articles you referenced for this design? If so, could you please share them?
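To make sure we are talking about the same thing, this is roughly what I understand the encoder-only setup to be. This is a minimal PyTorch sketch of my own, not code from your repository; all class, parameter, and layer names here are my assumptions:

```python
import torch
import torch.nn as nn

class EncoderOnlyForecaster(nn.Module):
    """Hypothetical encoder-only Transformer for forecasting:
    past values are projected to d_model, passed through a stack of
    self-attention encoder layers, and a linear head maps the last
    hidden state to the forecast horizon -- no decoder, no cross-attention."""
    def __init__(self, n_features=1, d_model=64, n_heads=4,
                 n_layers=3, horizon=24, dropout=0.1):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer,
                                             num_layers=n_layers)
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        h = self.encoder(self.input_proj(x))
        # forecast the next `horizon` steps from the last time step
        return self.head(h[:, -1])  # (batch, horizon)
```

Is this the general shape of your approach, or does your model consume the encoder outputs differently?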
One more question: why does your implementation have fewer dropout layers than the original implementation in "Attention Is All You Need"?
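For reference, the dropout I had in mind is the one described in section 5.4 of the paper: it is applied to each sub-layer's output before the residual addition, and to the sum of the embeddings and positional encodings. Here is a minimal PyTorch sketch of my own of that residual placement (the class name and arguments are hypothetical, not from your code):

```python
import torch.nn as nn

class SublayerConnection(nn.Module):
    """Residual wrapper as described in 'Attention Is All You Need':
    dropout is applied to the sub-layer output *before* it is added
    to the sub-layer input and normalized (post-norm)."""
    def __init__(self, d_model, p_drop=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(p_drop)

    def forward(self, x, sublayer):
        # LayerNorm(x + Dropout(Sublayer(x))), as in the paper
        return self.norm(x + self.dropout(sublayer(x)))
```

Many reference implementations also apply dropout to the attention weights themselves, so I was curious whether leaving some of these placements out of your model was a deliberate choice for the time series setting.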