Closed by hforoughmand 4 months ago
Hey @hforoughmand, if you want to run the version of the tutorial at the tip of the main branch (the one also published on the Triton website), I recommend building and installing the Triton main branch from source, or installing a nightly build (v3.0-*) per the README instructions.
If you want to run a stable Triton release (2.x) installed from PyPI (or installed implicitly along with a stable PyTorch release), then I recommend running the version of the tutorial code from the corresponding release branch, which may differ from the one on the website.
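To figure out which case applies, it can help to check the installed Triton version first. Here is a rough sketch, not an official tool: `installed_version` and `tutorial_source` are hypothetical helper names, and the "3.x means main, 2.x means release branch" mapping is just an assumption taken from the version numbers mentioned above, so it only reflects the state of things at the time of this thread.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def tutorial_source(triton_version):
    """Suggest which copy of the tutorial matches a Triton version string.

    Assumption (from this thread only): 3.x builds track main, 2.x are stable releases.
    """
    if triton_version is None:
        return "triton is not installed"
    major = int(triton_version.split(".")[0])
    if major >= 3:
        return "main branch / website tutorial"
    return "tutorial from the matching release branch"


if __name__ == "__main__":
    # Print what is actually installed, then the suggested tutorial source.
    for name in ("torch", "triton"):
        print(name, installed_version(name))
    print(tutorial_source(installed_version("triton")))
```

With PyTorch 2.3.0 from PyPI, I would expect this to report a 2.x Triton and point to the release-branch tutorial rather than the website version.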
For matmul on V100 specifically, you may want to review the open issues that reference V100. I think there may be a known FP16 regression on it; see https://github.com/openai/triton/issues/3478.
When I run tutorial 03 (https://triton-lang.org/main/getting-started/tutorials/03-matrix-multiplication.html#sphx-glr-getting-started-tutorials-03-matrix-multiplication-py) on a V100, I get the following error. Is that a problem with the installation or a known bug? My CUDA version is 11.8, my Python version is 3.8.18, and my PyTorch version is 2.3.0+cu118.