Open Ryan-Vereque opened 1 month ago
In the paper's primary training procedure, was the training time per batch (batch size 32) closer to 127.78 ms, or to that value times 32 (roughly 4.09 s)?