silverstar194 closed this 5 years ago
I ran my model for 6-8 weeks on my 1080 Ti. I created just about the largest model that I could fit in GPU memory while still running a decent batch size and sequence length, so it took a long time to train. It's entirely possible that you could get comparable results with a smaller model and less training time, or maybe with the same model and better hyperparameter tuning.
Was that 6-8 weeks continuous, or off and on?
I would estimate it was running about 85-90% of the time.
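For a rough sense of scale, the figures above (6-8 weeks of wall-clock time, running about 85-90% of the time) can be turned into approximate GPU-hours. This is only back-of-the-envelope arithmetic on the numbers quoted in this thread; the `gpu_hours` helper is a made-up name for illustration, and the result says nothing about how long a different model or dataset would take.

```python
HOURS_PER_WEEK = 7 * 24  # 168 wall-clock hours in a week


def gpu_hours(weeks: float, utilization: float) -> float:
    """Wall-clock weeks scaled by the fraction of time the GPU was training."""
    return weeks * HOURS_PER_WEEK * utilization


# Low end: 6 weeks at 85% uptime; high end: 8 weeks at 90% uptime.
low = gpu_hours(6, 0.85)
high = gpu_hours(8, 0.90)
print(f"roughly {low:.0f} to {high:.0f} GPU-hours on a single 1080 Ti")
```

That works out to somewhere around 860 to 1,210 hours of actual training time on the one card.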
How long is needed to train from scratch on fresh data, given the 1080 Ti GPU setup?