intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs

Dev environment based on `conda-forge` #1552

Open ZzEeKkAa opened 3 months ago

ZzEeKkAa commented 3 months ago

I'm opening this issue to track the progress on enabling project development within a conda environment based on the conda-forge channel, without external dependencies.

Current working solution:

  1. Create environment from file:

    channels:
    - https://prefix.dev/conda-forge
    - nodefaults
    dependencies:
    # compiler toolchain
    - gcc_linux-64=12  # [linux]
    - gxx_linux-64=12  # [linux]
    - libgcc-ng=12
    - libstdcxx-ng=12
    - libgfortran5=12
    - libgfortran-ng=12
    - dpcpp_linux-64>=2024.2.0
    # set the version as close as possible to your host system
    - sysroot_linux-64=2.28  # [linux]
    # build tools
    - cmake
    - ninja
    # llvm dependencies
    - zlib
    - zstd
    - libxml2
    - python=3.10
    # enhancements
    #  enable caching between builds and for fresh builds
    - ccache
    #  enable faster linking
    - lld
    # ocl loader + level_zero
    - ocl-icd
    - level-zero-devel
    # mkl for ipex
    - mkl-include
    - mkl-static
    # pytorch
    - python
    - numpy<2
    - pip
    - setuptools
    - pyyaml
    - requests
    - future
    - six
    # - mkl-devel
    # - libcblas * *_mkl
    - libgomp=12
    - libabseil
    - libprotobuf
    - sleef
    - typing
    - libuv
    - pkg-config
    - typing_extensions
    # ipex
    - psutil
    - numpy
    - packaging
    # triton
    - pydot
    - pyyaml
    - matplotlib
    - numpy
    - pandas
    - textx
    - multiprocess
    # tests
    - pytest
    - pytest-xdist
    - pytest-rerunfailures
    # end triton
    # scipy
    - scipy
    # pytorch
    #- pytorch
    - pip
    - pip:
    # triton
    - caliper-reader
    - llnl-hatchet
    # tests
    - pytest-select
    # pytorch
    #- intel-extension-for-pytorch
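To materialize this file, something like the following should work (a sketch; the file name `environment.yml` and the environment name `triton-dev` are placeholders, not names used in this issue):

    # create and activate the dev environment from the file above
    conda env create -n triton-dev -f environment.yml
    conda activate triton-dev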
  2. Build pytorch

    cd pytorch
    VERBOSE=1 DEBUG=1 USE_DISTRIBUTED=1 USE_XPU=1 USE_MKLDNN=1 USE_CUDA=0 \
      BUILD_TEST=0 USE_FBGEMM=0 USE_NNPACK=0 USE_QNNPACK=0 USE_XNNPACK=0 \
      CC=x86_64-conda-linux-gnu-gcc CXX=x86_64-conda-linux-gnu-g++ \
      SYCL_ROOT=$CONDA_PREFIX INTEL_MKL_DIR=$CONDA_PREFIX INTEL_COMPILER_DIR=$CONDA_PREFIX \
      LD_LIBRARY_PATH=$CONDA_PREFIX/lib BUILD_CUSTOM_PROTOBUF=0 \
      pip install -e . -v --no-build-isolation
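A quick smoke test of the resulting build might look like this (a sketch; it assumes the build produced an XPU-enabled torch, where `torch.xpu.is_available()` reports whether an XPU device is usable):

    # verify the editable install imports and sees the XPU backend
    python -c "import torch; print(torch.__version__, torch.xpu.is_available())"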
  3. Build triton

    cd triton/python
    VERBOSE=1 DEBUG=1 CC=x86_64-conda-linux-gnu-gcc CXX=x86_64-conda-linux-gnu-g++ pip install -e . -v --no-build-isolation
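As a minimal sanity check of the editable Triton install (a sketch; nothing project-specific assumed beyond the package importing):

    # verify triton imports from the conda environment
    python -c "import triton; print(triton.__version__)"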

  4. Create fake IPEX:

    mkdir $CONDA_PREFIX/lib/python3.10/site-packages/intel_extension_for_pytorch

This step may be skipped after #925
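Since Python 3 treats an empty directory under `site-packages` as a namespace package, the fake module should already be importable; a quick check (a sketch, verifying only the import, not any real IPEX functionality):

    # the empty directory imports as a namespace package
    python -c "import intel_extension_for_pytorch"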

  5. Test triton
    echo "test/unit/language/test_core.py::test_abs_fp8[in_dtype2]" > scripts/skiplist/default/all.txt && cat scripts/skiplist/default/language.txt >> scripts/skiplist/default/all.txt && cat scripts/skiplist/default/subprocess.txt >> scripts/skiplist/default/all.txt
    CPATH=$CONDA_PREFIX/include/sycl:$CONDA_PREFIX/include:$CPATH CC=icx CXX=icpx TRITON_TEST_SUITE=language \
    pytest -vvv -n 8 --device xpu test/unit/language/ --deselect-from-file=../scripts/skiplist/default/all.txt

5.1 lit tests

You need to install the `lit` and `llvmdev` conda packages.
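For example (a sketch, run inside the activated dev environment and using the conda-forge channel as in the environment file above):

    # pull the LLVM testing tool and dev package from conda-forge
    conda install -c conda-forge lit llvmdev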

    CPATH=$CONDA_PREFIX/include/sycl:$CONDA_PREFIX/include:$CPATH \
      PATH=$PATH:$CONDA_PREFIX/libexec/llvm \
      CC=icx CXX=icpx lit -v build/*/test

Blockers and TODOs:

Nice to haves

Workarounds for the upstream

These two PRs need to be applied to achieve the same compatibility and performance as with IPEX:

Other issues

pbchekin commented 3 months ago

Is it for build or runtime environment?

pbchekin commented 3 months ago

CC @leshikus

ZzEeKkAa commented 3 months ago

> Is it for build or runtime environment?

For the development environment, in other words for both.

vlad-penkin commented 1 month ago

@leshikus, @pbchekin, @ZzEeKkAa PTDB requires a special ABI-neutral MKL build which is not distributed via the intel (`-c https://software.repos.intel.com/python/conda/`) or conda-forge channels. Let's revisit this ticket after 2025.0 is released.

ZzEeKkAa commented 1 month ago

I agree that in its current state it is blocked by the 2025.0 release.

ZzEeKkAa commented 2 weeks ago

2025.0 is coming soon, but there is actually a workaround: compile pytorch with USE_STATIC_MKL=1. However, it must be pulled from the PTDB distribution.
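A sketch of how that workaround might be spliced into the step 2 build command; the INTEL_MKL_DIR path is a hypothetical placeholder for wherever the PTDB-provided MKL lands, and the remaining step 2 flags would stay as before:

    # hypothetical: link MKL statically, taking it from a PTDB-provided location
    USE_STATIC_MKL=1 INTEL_MKL_DIR=/path/to/ptdb/mkl \
      CC=x86_64-conda-linux-gnu-gcc CXX=x86_64-conda-linux-gnu-g++ \
      pip install -e . -v --no-build-isolation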

ZzEeKkAa commented 2 weeks ago

I've updated the blockers with the missing pti-gpu distribution.