intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs

Dev environment based on `conda-forge` #1552

Open ZzEeKkAa opened 3 months ago

ZzEeKkAa commented 3 months ago

I'm opening this issue to track the progress on enabling project development within a conda environment based on the conda-forge channel, without external dependencies.

Current working solution:

  1. Create environment from file:

    channels:
    - https://prefix.dev/conda-forge
    - nodefaults
    dependencies:
    # compiler toolchain
    - gcc_linux-64=12  # [linux]
    - gxx_linux-64=12  # [linux]
    - libgcc-ng=12
    - libstdcxx-ng=12
    - libgfortran5=12
    - libgfortran-ng=12
    - dpcpp_linux-64>=2024.2.0
    # set the version as close as possible to your host system
    - sysroot_linux-64=2.28  # [linux]
    # build tools
    - cmake
    - ninja
    # llvm dependencies
    - zlib
    - zstd
    - libxml2
    - python=3.10
    # enhancements
    #  enable caching between builds and for fresh builds
    - ccache
    #  enable faster linking
    - lld
    # ocl loader + level_zero
    - ocl-icd
    - level-zero-devel
    # mkl for ipex
    - mkl-include
    - mkl-static
    # pytorch
    - python
    - numpy<2
    - pip
    - setuptools
    - pyyaml
    - requests
    - future
    - six
    # - mkl-devel
    # - libcblas * *_mkl
    - libgomp=12
    - libabseil
    - libprotobuf
    - sleef
    - typing
    - libuv
    - pkg-config
    - typing_extensions
    # ipex
    - psutil
    - numpy
    - packaging
    # triton
    - pydot
    - pyyaml
    - matplotlib
    - numpy
    - pandas
    - textx
    - multiprocess
    # tests
    - pytest
    - pytest-xdist
    - pytest-rerunfailures
    # end triton
    # scipy
    - scipy
    # pytorch
    #- pytorch
    - pip
    - pip:
    # triton
    - caliper-reader
    - llnl-hatchet
    # tests
    - pytest-select
    # pytorch
    #- intel-extension-for-pytorch
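To materialize this file, something like the following should work (a sketch; the file name `environment.yml` and the environment name `triton-dev` are placeholders, not names used in this issue):

    # create and activate the dev environment from the file above
    conda env create -n triton-dev -f environment.yml
    conda activate triton-dev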
  2. Build pytorch

    cd pytorch
    VERBOSE=1 DEBUG=1 USE_DISTRIBUTED=1 USE_XPU=1 USE_MKLDNN=1 USE_CUDA=0 \
      BUILD_TEST=0 USE_FBGEMM=0 USE_NNPACK=0 USE_QNNPACK=0 USE_XNNPACK=0 \
      CC=x86_64-conda-linux-gnu-gcc CXX=x86_64-conda-linux-gnu-g++ \
      SYCL_ROOT=$CONDA_PREFIX INTEL_MKL_DIR=$CONDA_PREFIX INTEL_COMPILER_DIR=$CONDA_PREFIX \
      LD_LIBRARY_PATH=$CONDA_PREFIX/lib BUILD_CUSTOM_PROTOBUF=0 \
      pip install -e . -v --no-build-isolation
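A quick smoke test of the resulting build might look like this (a sketch; it assumes the build produced an XPU-enabled torch, where `torch.xpu.is_available()` reports whether an XPU device is usable):

    # verify the editable install imports and sees the XPU backend
    python -c "import torch; print(torch.__version__, torch.xpu.is_available())"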
  3. Build triton

    cd triton/python
    VERBOSE=1 DEBUG=1 CC=x86_64-conda-linux-gnu-gcc CXX=x86_64-conda-linux-gnu-g++ pip install -e . -v --no-build-isolation
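As a minimal sanity check of the editable Triton install (a sketch; nothing project-specific assumed beyond the package importing):

    # verify triton imports from the conda environment
    python -c "import triton; print(triton.__version__)"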

  4. Create fake IPEX:

    mkdir $CONDA_PREFIX/lib/python3.10/site-packages/intel_extension_for_pytorch

This step may be skipped after #925
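Since Python 3 treats an empty directory under `site-packages` as a namespace package, the fake module should already be importable; a quick check (a sketch, verifying only the import, not any real IPEX functionality):

    # the empty directory imports as a namespace package
    python -c "import intel_extension_for_pytorch"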

  5. Test triton
    echo "test/unit/language/test_core.py::test_abs_fp8[in_dtype2]" > scripts/skiplist/default/all.txt && cat scripts/skiplist/default/language.txt >> scripts/skiplist/default/all.txt && cat scripts/skiplist/default/subprocess.txt >> scripts/skiplist/default/all.txt
    CPATH=$CONDA_PREFIX/include/sycl:$CONDA_PREFIX/include:$CPATH CC=icx CXX=icpx TRITON_TEST_SUITE=language \
    pytest -vvv -n 8 --device xpu test/unit/language/ --deselect-from-file=../scripts/skiplist/default/all.txt

5.1 lit tests

You need to install the `lit` and `llvmdev` conda packages.
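For example (a sketch, run inside the activated dev environment and using the conda-forge channel as in the environment file above):

    # pull the LLVM testing tool and dev package from conda-forge
    conda install -c conda-forge lit llvmdev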

    CPATH=$CONDA_PREFIX/include/sycl:$CONDA_PREFIX/include:$CPATH \
      PATH=$PATH:$CONDA_PREFIX/libexec/llvm \
      CC=icx CXX=icpx lit -v build/*/test

Blockers and TODOs:

Nice to haves

Workarounds for the upstream

These two PRs need to be applied to achieve the same compatibility and performance as with IPEX:

Other issues

pbchekin commented 3 months ago

Is it for build or runtime environment?

pbchekin commented 3 months ago

CC @leshikus

ZzEeKkAa commented 3 months ago

> Is it for build or runtime environment?

For the development environment, in other words for both.

vlad-penkin commented 1 month ago

@leshikus, @pbchekin, @ZzEeKkAa PTDB requires a special ABI-neutral MKL build which is not distributed via the intel (`-c https://software.repos.intel.com/python/conda/`) or conda-forge channels. Let's revisit this ticket after 2025.0 is released.

ZzEeKkAa commented 1 month ago

I agree that in its current state it is blocked by the 2025.0 release.

ZzEeKkAa commented 2 weeks ago

2025.0 is coming soon, but there is actually a workaround: compile pytorch with USE_STATIC_MKL=1. However, it must be pulled from the PTDB distribution.
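A sketch of how that workaround might be spliced into the step 2 build command; the INTEL_MKL_DIR path is a hypothetical placeholder for wherever the PTDB-provided MKL lands, and the remaining step 2 flags would stay as before:

    # hypothetical: link MKL statically, taking it from a PTDB-provided location
    USE_STATIC_MKL=1 INTEL_MKL_DIR=/path/to/ptdb/mkl \
      CC=x86_64-conda-linux-gnu-gcc CXX=x86_64-conda-linux-gnu-g++ \
      pip install -e . -v --no-build-isolation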

ZzEeKkAa commented 2 weeks ago

I've updated the blockers with the missing pti-gpu distribution.