ROCm / triton

Development repository for the Triton language and compiler
MIT License

[Issue]: triton fails to compile due to non-existent llvm tarball #631

Closed prarit closed 3 months ago

prarit commented 3 months ago

Problem Description

llvm-49af6502-ubuntu-x64.tar.gz is no longer available as a download.

[I'm not sure if this belongs in aotriton or if it belongs here in this triton fork. Ultimately the issue is with the triton compile requested by aotriton so I decided to file it here. If that's a mistake, please let me know and I'll open an issue with aotriton.]

Operating System

RHEL9, Fedora (likely all OSes fwiw)

CPU

Any CPU

GPU

AMD Instinct MI300X

ROCm Version

ROCm 6.1.0

ROCm Component

ROCm/triton

Steps to Reproduce

An aotriton build points to this triton repo, which, when compiled, logs the following:

2024-08-18T13:02:58,456   downloading and extracting https://github.com/pybind/pybind11/archive/refs/tags/v2.11.1.tar.gz ...
2024-08-18T13:02:58,456   downloading and extracting https://tritonlang.blob.core.windows.net/llvm-builds/llvm-49af6502-ubuntu-x64.tar.gz ...
2024-08-18T13:02:58,457   error: HTTP Error 409: Public access is not permitted on this storage account.

This download reference comes from the upstream triton repo https://github.com/triton-lang/triton/tree/release/2.3.x, which has since moved to hash 5e5a22caf88ac1ccfa8dc5720295fdeba0ad9372. That hash is currently available at the download location above (see https://github.com/triton-lang/triton/issues/4527).

However, the ROCm/triton repository (this repository) still references the old 49af6502 package, which no longer exists at that download location -- and the above error is seen when attempting to compile torch, aotriton, or triton from this repository.

To reproduce, build one of torch-2.3.1, aotriton (https://github.com/ROCm/aotriton.git), or triton (from this repo, not triton-lang).
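The failure can also be checked without running a full build. The sketch below assumes the tarball URL follows the pattern seen in the log above (`<base>/llvm-<short hash>-<platform>.tar.gz`). The function names `llvm_tarball_url` and `probe_url` are hypothetical helpers, not part of triton or aotriton.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the tarball URL from a base URL, a short LLVM
# hash, and a platform suffix (defaults to the one seen in the build log).
llvm_tarball_url() {
    local base="$1" short_hash="$2" platform="${3:-ubuntu-x64}"
    printf '%s/llvm-%s-%s.tar.gz\n' "$base" "$short_hash" "$platform"
}

# Hypothetical helper: print the HTTP status code for a URL without
# downloading the body. Requires curl and network access.
probe_url() {
    curl --silent --head --output /dev/null --write-out '%{http_code}\n' "$1"
}

# Example (needs network):
#   probe_url "$(llvm_tarball_url https://tritonlang.blob.core.windows.net/llvm-builds 49af6502)"
```

A status other than 200 for the old location would be consistent with the error in the log. The exact code returned to a HEAD request may differ from the one the build script reports.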

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

ROCk module is NOT loaded, possibly no GPU devices (I will note that I am compiling and not executing on this system)

Additional Information

No response

lisaong commented 3 months ago

This change needs to be cherry-picked on this fork of triton: https://github.com/triton-lang/triton/commit/06e6799f4eba6035ec35c528e8fefd3d4d724b6f

Even so, the new blob store may not contain all of the older LLVM drops, so it may only resolve the build issues for newer versions of triton.
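For anyone who wants to try the cherry-pick on a local clone before a fix lands here, a rough sketch follows. The function name `cherry_pick_from` and the remote name `upstream` are hypothetical. The pick may not apply cleanly on this fork's branches, in which case conflicts have to be resolved by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fetch a remote and cherry-pick one commit from it.
# Run it from inside the clone you build from (e.g. a ROCm/triton checkout).
cherry_pick_from() {
    local remote_url="$1" commit="$2" remote_name="${3:-upstream}"
    # Add the remote only if it is not configured yet.
    git remote get-url "$remote_name" >/dev/null 2>&1 ||
        git remote add "$remote_name" "$remote_url" || return 1
    git fetch "$remote_name" || return 1
    # -x records the original commit hash in the new commit message.
    git cherry-pick -x "$commit"
}

# Example (needs network, run inside the fork's working tree):
#   cherry_pick_from https://github.com/triton-lang/triton.git \
#       06e6799f4eba6035ec35c528e8fefd3d4d724b6f
```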

unclemusclez commented 3 months ago

> Even so, the new blob store may not contain all of the older LLVM drops, so it may only resolve the build issues for newer versions of triton.

https://oaitriton.blob.core.windows.net/public/llvm-builds/llvm-ce80c80d-ubuntu-x64.tar.gz << taken from triton-lang/triton/python/setup.py, on Ubuntu 22.04, and it doesn't work. I believe you are correct.

prarit commented 3 months ago

This is what I did:

git clone https://github.com/llvm/llvm-project
(cd llvm-project && git checkout 49af6502c6dcb4a7f7520178bd14df396f78240c)
(mkdir -p llvm-project/build && cd llvm-project/build && cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON ../llvm -DLLVM_ENABLE_PROJECTS="mlir;llvm" -DLLVM_TARGETS_TO_BUILD="AMDGPU" -DCMAKE_C_COMPILER=/usr/bin/gcc -DCMAKE_CXX_COMPILER=/usr/bin/g++ -DCMAKE_ASM_COMPILER=/usr/bin/gcc && ninja)

followed by setting the following environment variables for torch

export LLVM_BUILD_DIR=/path/to/your/llvm-project/build
export LLVM_INCLUDE_DIRS=$LLVM_BUILD_DIR/include
export LLVM_LIBRARY_DIR=$LLVM_BUILD_DIR/lib
export LLVM_SYSPATH=$LLVM_BUILD_DIR

and then building torch.

I based this on

https://github.com/triton-lang/triton?tab=readme-ov-file#building-with-a-custom-llvm

The above works for me on RHEL9.4.
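Since a long torch build can fail late if the LLVM paths are wrong, a quick sanity check before starting it may save time. This is only a sketch: `check_llvm_build_dir` is a hypothetical helper, and it assumes nothing beyond the `include` and `lib` subdirectories that the variables above point at.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that the directory intended for LLVM_BUILD_DIR
# contains the subdirectories the LLVM_* variables above refer to.
check_llvm_build_dir() {
    local dir="$1" sub
    for sub in include lib; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing: $dir/$sub" >&2
            return 1
        fi
    done
}

# Example:
#   check_llvm_build_dir "$LLVM_BUILD_DIR" && echo "LLVM build dir looks usable"
```

Passing this check does not prove the build is complete, only that the paths exist.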