Closed by ghost 6 years ago
I think it is normal. I use a 1080Ti and the example training on RSC15 finishes in about 2 hours. On the older Titan X (Maxwell architecture) it takes 4-5 hours. This is with an additional speed-up technique that has not been pushed to the public repo yet*. The K80 is a fairly old GPU, so 10 hours with the public code sounds believable to me.
If you want to speed up training and don't mind losing some accuracy, set batch_size to 64 (instead of 32) and n_epochs to 5 (instead of 10). This will make training roughly 4 times faster. With this setting I get 0.7198 for recall@20 and 0.3074 for MRR@20 (instead of 0.7261 and 0.3124), but training takes only 30 minutes on the 1080Ti.
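The ~4x figure follows from simple update-count arithmetic: doubling the batch size halves the number of mini-batch updates per epoch, and halving the epoch count halves them again. A minimal sketch of that reasoning (the event count `n_events` below is a hypothetical placeholder, not the actual RSC15 size):

```python
# Rough arithmetic behind the ~4x speed-up from batch_size 32->64, n_epochs 10->5.
# n_events is a hypothetical dataset size; the resulting ratio is independent of it.
n_events = 1_000_000

def n_updates(n_events, batch_size, n_epochs):
    """Number of mini-batch weight updates over a full training run."""
    return (n_events // batch_size) * n_epochs

baseline = n_updates(n_events, batch_size=32, n_epochs=10)  # default settings
faster = n_updates(n_events, batch_size=64, n_epochs=5)     # suggested settings

print(baseline // faster)  # -> 4
```

In practice the wall-clock gain can be slightly better than 4x, since larger batches also use the GPU more efficiently per update.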
*The speed-up will be published as soon as we can finalize the license text for the code.
Thanks for the quick and detailed response, I'll try that.
Hi, I have a Tesla K80, and the given example has been running for more than 10 hours. I want to know if this is normal, and how long the training process usually takes on a K80 GPU?