ArvinZhuang / DSI-transformers

A Hugging Face Transformers implementation of "Transformer Memory as a Differentiable Search Index"
MIT License

How long does it take to train the model with 1 V100 GPU? #3

Closed rxlian closed 2 years ago

ArvinZhuang commented 2 years ago

Hi, in my example code I run evaluation on both the training set and the test set after every epoch, which is very time-consuming; it takes a couple of days to converge. You can comment out the evaluation on the training set and run the test-set evaluation only every 10 epochs or so. On my side, training converges in around 3 days.
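The schedule described above can be sketched as a small helper: skip the training-set evaluation entirely and gate the test-set evaluation to every N epochs. The function name and interval are illustrative, not taken from the repo's code:

```python
def should_run_test_eval(epoch: int, eval_interval: int = 10) -> bool:
    """Return True when the test-set evaluation should run after this
    (0-indexed) epoch, i.e. once every `eval_interval` epochs."""
    return (epoch + 1) % eval_interval == 0


# Example: over 30 epochs, evaluation fires after epochs 9, 19, and 29.
eval_epochs = [e for e in range(30) if should_run_test_eval(e)]
print(eval_epochs)  # → [9, 19, 29]
```

With the Hugging Face `Trainer`, the same effect can be had by attaching a custom `TrainerCallback` whose `on_epoch_end` sets `control.should_evaluate` using this kind of check, rather than evaluating every epoch.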