Open ghostplant opened 4 months ago
In scripts/ there's the script I used for autotuning, feel free to try that.
Now I get 11 TFLOPS on the 2080 Ti and 17 TFLOPS on the A100. Is that reasonable?
Seems pretty reasonable to me. It depends on which A100 you have. In the post I quote some numbers that I got.
I only get 15 TFLOPS on the A100 (sm80) and 6 TFLOPS on the 2080 Ti (sm75).
With proper tuning, it should be able to reach > 17 TFLOPS on the A100 and > 12 TFLOPS on the 2080 Ti, right?
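When comparing throughput numbers like these, it helps to check that everyone computes them the same way. A minimal sketch, assuming the usual convention of counting 2*M*N*K floating-point operations per matrix multiply; the function name, matrix size, and timing here are hypothetical and not taken from the repo:

```python
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved TFLOPS for one (m x k) by (k x n) matrix multiply."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # One multiply and one add per inner-product term (assumed convention).
    flops = 2 * m * n * k
    return flops / seconds / 1e12


# Hypothetical measurement: a 4096 x 4096 x 4096 GEMM that takes 8 ms.
print(f"{gemm_tflops(4096, 4096, 4096, 8e-3):.1f} TFLOPS")  # prints "17.2 TFLOPS"
```

The time should be the average over several kernel launches after a warm-up run, since the first launch usually includes one-time overhead.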