triton-lang / triton

Development repository for the Triton language and compiler
https://triton-lang.org/
MIT License

Triton does not seem to work on WSL (Windows Subsystem for Linux) #4274

Open huynhducloi00 opened 2 months ago

huynhducloi00 commented 2 months ago

1) In "interpreter" mode, the tl.load(address) command (which reads data into the SM unit) returns values that are all wrong compared to the original tensor. The normal running mode (on a real GPU) seems fine, but of course, we care about interpreter mode.
2) I thought I might need to rebuild Triton from scratch on this WSL system, but the build is failing:

cmake /home/manh/loi_major_build/triton -G Ninja -DCMAKE_MAKE_PROGRAM=/usr/bin/ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DLLVM_ENABLE_WERROR=ON -DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/manh/loi_major_build/triton/python/build/lib.linux-x86_64-3.10/triton/_C -DTRITON_BUILD_TUTORIALS=OFF -DTRITON_BUILD_PYTHON_MODULE=ON -DPython3_EXECUTABLE:FILEPATH=/usr/bin/python3 -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DPYTHON_INCLUDE_DIRS=/usr/include/python3.10 -DTRITON_CODEGEN_BACKENDS=nvidia;amd -DTRITON_PLUGIN_DIRS= -DPYBIND11_INCLUDE_DIR=/home/manh/.triton/pybind11/pybind11-2.11.1/include -DLLVM_INCLUDE_DIRS=/home/manh/.triton/llvm/llvm-657ec732-ubuntu-x64/include -DLLVM_LIBRARY_DIR=/home/manh/.triton/llvm/llvm-657ec732-ubuntu-x64/lib -DCMAKE_BUILD_TYPE=TritonRelBuildWithAsserts -DJSON_INCLUDE_DIR=/home/manh/.triton/json//include -DPYBIND11_INCLUDE_DIR=/home/manh/.triton/pybind11/pybind11-2.11.1/include -DCUPTI_INCLUDE_DIR=/home/manh/loi_major_build/triton/third_party/nvidia/backend/include -DROCTRACER_INCLUDE_DIR=/home/manh/loi_major_build/triton/third_party/amd/backend/include

The only error message is "c++: fatal error: Killed signal terminated program cc1plus". I am not sure how to debug that.

The way setup.py is written, or the way cmake is run, makes it pretty hard to debug why the build process is failing.

chengzeyi commented 2 months ago

You are facing an OOM issue. Buying more RAM, increasing the memory WSL is allowed to use, or limiting the number of threads used to compile Triton should help.
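As a rough sketch of the "limit the number of threads" option, the script below picks a parallel job count from the memory currently available inside WSL. It rests on assumptions that are not stated in this thread: that the Triton build honors a `MAX_JOBS` environment variable (worth verifying in your checkout's setup.py), and that each compile job needs about 2 GiB, which is a guess rather than a measured figure. The helper names `available_memory_gib` and `pick_max_jobs` are hypothetical.

```python
import os


def available_memory_gib(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in GiB, or None if it cannot be read."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemAvailable:"):
                    # /proc/meminfo reports the value in kB.
                    return int(line.split()[1]) / (1024 ** 2)
    except OSError:
        pass
    return None


def pick_max_jobs(avail_gib, cpu_count, gib_per_job=2.0):
    """Cap parallel jobs by both CPU count and available memory.

    gib_per_job is an assumed per-job budget, not a measured value.
    Falls back to a single job when memory is unknown.
    """
    if avail_gib is None:
        return 1
    by_memory = int(avail_gib // gib_per_job)
    return max(1, min(cpu_count, by_memory))


if __name__ == "__main__":
    jobs = pick_max_jobs(available_memory_gib(), os.cpu_count() or 1)
    # Prints a shell line to evaluate before starting the build.
    print(f"export MAX_JOBS={jobs}")
```

To raise the WSL memory limit instead, WSL 2 reads a `memory=` setting under the `[wsl2]` section of `.wslconfig` in the Windows user profile directory; the right value depends on how much RAM the host has.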

huynhducloi00 commented 1 month ago

Yeah, but what about the question about the interpreter not working on WSL? I guess it is lower priority.

Jokeren commented 1 month ago

In "interpreter" mode, the tl.load(address) command (which reads data into the SM unit) returns values that are all wrong compared to the original tensor. The normal running mode (on a real GPU) seems fine, but of course, we care about interpreter mode.

Can you share your code and your command?