I read this paper, https://arxiv.org/pdf/1804.04637.pdf, which says that training the model takes 3 hours. But when I ran the code, training took only 20 minutes, so I don't know what mistake I have made. Can anyone help me figure this out?