Closed tcapelle closed 10 months ago
Hi, small PR to add the missing warmup and the total number of steps so that training runs correctly. I am also adding info on the GPU requirements (80GB GPUs). <- this is on the main README =P
The link to the experiment
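For context on why both values matter, here is a minimal sketch of a linear-warmup, linear-decay learning rate schedule. It is not the code from this PR: the function name `lr_at_step`, the schedule shape, and the numbers in the usage lines are assumptions chosen for illustration. The point is only that a schedule like this cannot be computed without both the warmup length and the total number of steps.

```python
def lr_at_step(step, base_lr, warmup_steps, total_steps):
    """Hypothetical schedule: linear warmup to base_lr, then linear decay to 0."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * (step / warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(total_steps - step, 0)
    decay_span = max(total_steps - warmup_steps, 1)
    return base_lr * (remaining / decay_span)


# Illustrative values only (not taken from the PR).
for s in (0, 50, 100, 550, 1000):
    print(s, lr_at_step(s, base_lr=1e-3, warmup_steps=100, total_steps=1000))
```

If the total step count is missing or wrong, the decay phase ends too early or never reaches zero, which is one way training can silently go off track.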