**jere357** opened 9 months ago:
I ran four experiments on our ada6k, training `vit_l_16` with different setups, to see how much this flag helps. The speedup seems to be significant only for `--precision 32` training.
| `float32_matmul_precision` | `--precision` | img/s |
|---|---|---|
| highest (default) | 32 | 47 |
| highest (default) | 16-mixed | 177 |
| high | 32 | 94 |
| high | 16-mixed | 176 |
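For context, here is a minimal sketch of the kind of run being compared, assuming torchvision's `vit_l_16` wrapped in a Lightning 2.x module (`LitViT` below is a hypothetical name) and that the `--precision` CLI flag maps to `Trainer(precision=...)`:

```python
import torch
import lightning as L

# The flag being benchmarked: "high" lets float32 matmuls use TF32
# Tensor Core kernels; "highest" (the default) keeps full fp32 precision.
torch.set_float32_matmul_precision("high")

trainer = L.Trainer(
    accelerator="gpu",
    devices=1,
    precision="32-true",  # or "16-mixed", matching the rows in the table
)
# trainer.fit(LitViT(), datamodule=...)  # LitViT is a hypothetical wrapper
```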
@jere357 Can you create a PR for this?
```
You are using a CUDA device ('NVIDIA GeForce RTX 3060') that has Tensor Cores. To properly utilize them, you should set
torch.set_float32_matmul_precision('medium' | 'high')
which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
```
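As a usage note, acting on that warning is a one-liner in plain PyTorch, independent of Lightning; a minimal sketch (the `get_float32_matmul_precision` call is only there to confirm the setting took effect):

```python
import torch

# Opt in once, early in the entry point, before any float32 matmuls run.
# "high" allows TF32 Tensor Core kernels; "medium" also allows bfloat16,
# trading more precision for speed (see the linked PyTorch docs).
torch.set_float32_matmul_precision("high")
print(torch.get_float32_matmul_precision())  # -> "high"
```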