intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

[Documentation] Update README, add notes on Intel specific tutorials #1910

Closed vlad-penkin closed 1 week ago

alexbaden commented 1 month ago

Might be easier to maintain if we put these notes in the individual tutorials, like upstream Triton does.

pbchekin commented 2 weeks ago

This issue does not have any description, and the README already contains notes on running tutorials with Triton XPU. @vlad-penkin, could you clarify what is in scope?

vlad-penkin commented 2 weeks ago

This ticket was created based on feedback from @dvrogozh.

@dvrogozh feel free to elaborate more on the details.

dvrogozh commented 1 week ago
pbchekin commented 1 week ago

@dvrogozh, thank you for your feedback.

  • Fix sourcing of oneAPI environment variables. The description contradicts the referenced prerequisite instructions: they ask to source /opt/intel/oneapi/pytorch-gpu-dev-0.5/oneapi-vars.sh, but you source /opt/intel/oneapi/setvars.sh instead (in this and other sections of the README below):

The instructions in the README are correct and there is no contradiction. With source /opt/intel/oneapi/setvars.sh you initialize all components installed in /opt/intel/oneapi, while source /opt/intel/oneapi/pytorch-gpu-dev-0.5/oneapi-vars.sh initializes only the PyTorch Prerequisites for Intel GPUs. If you read the README carefully, you will notice that PTI is also a prerequisite that needs to be installed and initialized. So you have two options:

source /opt/intel/oneapi/pytorch-gpu-dev-0.5/oneapi-vars.sh
source /opt/intel/oneapi/pti/latest/env/vars.sh

or

source /opt/intel/oneapi/setvars.sh

The README uses the latter, which is obviously more user-friendly than the former.
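As a hedged illustration, the choice between the two options above could be scripted like this. The paths follow the README; the helper function, its name, and the fallback logic are my own and not part of the repository:

```shell
# Sketch only: pick an initialization path based on what is installed.
# Paths follow the README; the helper function itself is hypothetical.
init_oneapi() {
    oneapi_root="${1:-/opt/intel/oneapi}"
    if [ -f "$oneapi_root/setvars.sh" ]; then
        # Initializes every installed oneAPI component at once.
        . "$oneapi_root/setvars.sh"
    elif [ -f "$oneapi_root/pytorch-gpu-dev-0.5/oneapi-vars.sh" ]; then
        # Initializes only the PyTorch prerequisites, so PTI must be
        # initialized separately.
        . "$oneapi_root/pytorch-gpu-dev-0.5/oneapi-vars.sh"
        . "$oneapi_root/pti/latest/env/vars.sh"
    else
        echo "oneAPI not found under $oneapi_root" >&2
        return 1
    fi
}
```

Calling `init_oneapi` with no argument assumes the default /opt/intel/oneapi prefix; pass a different prefix if oneAPI is installed elsewhere.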

pbchekin commented 1 week ago
  • Fix the wording or align the instructions between the usual setup and the custom LLVM setup. It currently does not build "as above", since for LLVM you use pip install python while for the usual setup you use scripts/install-pytorch.sh --source. My 2 cents here: avoid custom scripts that wrap standard build commands. I.e., I would instruct users to always use pip install directly and move the description of your script to advanced usage documentation.

In my opinion, this section does not need to be in the README because most Triton users will never use a custom LLVM. It is more reasonable to move it to a separate document. Meanwhile, we are planning to update the section to be compatible with our build instructions.
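To make the asymmetry concrete, here is a dry-run sketch of the two build paths being discussed. The `LLVM_SYSPATH` variable follows upstream Triton's custom-LLVM instructions, but treat the variable name and all paths as assumptions, not project documentation; `run=echo` only prints the commands instead of executing them:

```shell
# Dry-run sketch: 'run=echo' prints commands instead of executing them.
run=echo

# Usual setup: the repo's wrapper script builds a matching PyTorch from source.
usual=$($run scripts/install-pytorch.sh --source)

# Custom LLVM setup: upstream-style editable install against a local LLVM
# build (placeholder path; variable name from upstream Triton's docs).
custom=$($run env LLVM_SYSPATH=/path/to/llvm-build pip install -e python)

printf '%s\n%s\n' "$usual" "$custom"
```

Setting `run=` (empty) would execute the commands for real; as written, the script only shows what would run.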

pbchekin commented 1 week ago
  • How about PyTorch 2.5? Will it also work out of the box? I suggest starting a compatibility matrix in the README, since not all Triton releases will work with all PyTorch releases - Triton has an unstable API, as far as I know.

Good catch, this will be updated with a statement that our fork requires a special build of PyTorch, which can be built from source or installed from nightly wheels.

Regarding Triton releases and a compatibility matrix: since our fork does not have releases yet (except the 3.0.0 beta releases), the only Triton XPU versions that need to be considered are the most recent nightly wheels and builds from the top of the main branch. In both cases, Triton comes with a special version of PyTorch that is compatible with that version of Triton, so no compatibility matrix is required.
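If a runtime pin check were ever wanted instead of a matrix, a minimal sketch could look like the following. The helper names and the pinned version string are hypothetical; in practice the required PyTorch build is simply the one shipped alongside the Triton XPU nightly wheels:

```python
import re

def release_tuple(version: str) -> tuple:
    """Extract the numeric release part of a version string,
    e.g. '2.5.0a0+git1b2c3d' -> (2, 5, 0)."""
    m = re.match(r"(\d+(?:\.\d+)*)", version)
    return tuple(int(p) for p in m.group(1).split(".")) if m else ()

def matches_pin(torch_version: str, pinned: str) -> bool:
    """Compare only the numeric release part, ignoring local build
    suffixes like '+xpu' or 'a0+git...'. The pin itself is hypothetical."""
    return release_tuple(torch_version) == release_tuple(pinned)
```

A caller could run something like `matches_pin(torch.__version__, "2.5.0")` at import time and emit a warning on mismatch.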

pbchekin commented 1 week ago
  • Changelog copied from original Triton and is not informative

This is not correct: the tips for building and tips for hacking are valid for Triton XPU. I do not see any reason to remove them from the README.

pbchekin commented 1 week ago
  • Will you provide support if someone uses the Intel version of Triton on unintended hardware?

Good point. We discussed this internally, and we will remove from the README hardware that is not validated in our CI.

pbchekin commented 1 week ago
  • To avoid speculation, I suggest giving a clear message on whether you have your own version of the tutorials and how to use them.

Currently our README contains modified code for one tutorial, but I agree that the wording can be more precise with respect to what changes are needed and where the tutorials are located.

pbchekin commented 1 week ago
  • The problem here is that the wrong expectation is set for the Intel Triton repository from the very beginning. It is titled "backend", which immediately suggests that the repo produces a plugin module for upstream Triton; instead, it produces a different version of the entire Triton and is actually a fork, which is not marked as such at the GitHub level. Readers might then get confused about which upstream Triton docs still hold for Intel's version and which do not. Tutorials are one of the pieces here.

I hope you have noticed that the project is still in progress; our goal is to have a Triton backend instead of a fork. I do not think it is reasonable to rename the repository to reflect the transitional state. Instead, the project and repository name clearly indicate that the goal is to have a Triton backend.

dvrogozh commented 1 week ago

Changelog copied from original Triton and is not informative

This is not correct: the tips for building and tips for hacking are valid for Triton XPU. I do not see any reason to remove them from the README.

I made a typo pointing to the text in the README. I did not mean the tips section - it is perfectly fine and useful. I was referring to the changelog section below, which starts with "Version 2.2 is out". https://github.com/intel/intel-xpu-backend-for-triton/blob/07190618bc2efcccc5db4e19c85d44b453823032/README.md?plain=1#L333-L338

dvrogozh commented 1 week ago

@pbchekin : thank you for the fixes.