The README says the model trained at 12,500 ch/s across 4 NVIDIA Pascal GPUs.
Is that 12,500 per GPU, or 12,500 in total across the 4 GPUs?
I'm trying to benchmark my setup so I can estimate how long I'd need to wait to match the equivalent of the 1 month of training time mentioned in the paper.
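For context, here is a minimal sketch of the estimate I'm trying to make. It assumes the 12,500 ch/s figure is the aggregate throughput across all 4 GPUs (the ambiguity asked about above); the function name and the 30-day month are my own choices, not from the README or paper.

```python
# Estimate how long my setup needs to process the same number of
# characters as "1 month of training at 12,500 ch/s".
# Assumption: 12,500 ch/s is the TOTAL across 4 GPUs, not per GPU.

SECONDS_PER_MONTH = 30 * 24 * 3600  # treating "1 month" as 30 days

def equivalent_training_seconds(my_chars_per_sec,
                                ref_chars_per_sec=12_500,
                                ref_seconds=SECONDS_PER_MONTH):
    """Seconds my setup needs to see the same total character count."""
    total_chars = ref_chars_per_sec * ref_seconds
    return total_chars / my_chars_per_sec

# Example: a setup benchmarked at 5,000 ch/s total
days = equivalent_training_seconds(5_000) / 86_400
# -> 75.0 days (12,500 / 5,000 * 30 days)
```

If the figure is instead per GPU, the reference throughput would be 4 × 12,500 = 50,000 ch/s, which quadruples the estimate.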