intel / intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
Apache License 2.0

Docker Image for v1.13.10+xpu? #288

Closed BA8F0D39 closed 1 year ago

BA8F0D39 commented 1 year ago

With torch 1.10.0a0+git3d5f2d4 and ipex 1.10.200+gpu, Stable Diffusion works perfectly in the Intel PyTorch Docker container. It can use fp16 precision to generate any image.

docker run -v /opt2/test/:/workspace \
           -v /dev/dri/by-path:/dev/dri/by-path \
           --device /dev/dri \
           --privileged \
           -it \
           intel/intel-optimized-pytorch:gpu bash

pip install diffusers ftfy transformers

import intel_extension_for_pytorch
import torch
from diffusers import StableDiffusionPipeline

model_id="runwayml/stable-diffusion-v1-5"
prompt = "vivid red hot air ballons over paris in the evening"
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # this can be torch.float32 as well
    revision="fp16",
    use_auth_token="<the token you generated>")
pipe = pipe.to("xpu")
image = pipe(prompt).images[0]
image.save(f"{prompt[:5]}.png")

However, upgrading to v1.13.10+xpu inside the same container produces garbled and sometimes blank images:

apt update
apt dist-upgrade
python -m pip install torch==1.13.0a0 -f https://developer.intel.com/ipex-whl-stable-xpu
python -m pip install intel_extension_for_pytorch==1.13.10+xpu -f https://developer.intel.com/ipex-whl-stable-xpu
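After a manual upgrade like the one above, one thing worth ruling out is a mismatched torch / ipex pair. The following is a minimal sketch, assuming the two packages are expected to share the same major.minor version (an assumption here, not something stated in this thread). The helper names `major_minor` and `versions_match` are hypothetical.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '1.13.10+xpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_match(torch_version, ipex_version):
    """Return True when both version strings share the same major.minor prefix.

    Assumption: a matching major.minor prefix is the pairing rule to check.
    """
    return major_minor(torch_version) == major_minor(ipex_version)
```

Inside the container this could be called as `versions_match(torch.__version__, ipex.__version__)` after importing both packages.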

There are no error messages, but I suspect that either something is linked incorrectly or fp16 computation changed between 1.10.200+gpu and v1.13.10+xpu.

Is it possible for Intel to update the Docker image?
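One way to narrow down the fp16 suspicion above is to inspect the numbers that come out of the pipeline, since blank output is often associated with NaN or constant values. This is only a sketch: the helper names `summarize_values` and `looks_blank` and the tolerance are hypothetical, and it works on a plain flat list of floats so it does not depend on any particular torch or ipex build.

```python
import math


def summarize_values(values):
    """Count NaN/inf entries and report the finite min and max of a flat iterable."""
    finite = []
    nan_count = 0
    inf_count = 0
    for v in values:
        if math.isnan(v):
            nan_count += 1
        elif math.isinf(v):
            inf_count += 1
        else:
            finite.append(v)
    return {
        "nan": nan_count,
        "inf": inf_count,
        "min": min(finite) if finite else None,
        "max": max(finite) if finite else None,
    }


def looks_blank(values, tolerance=1e-3):
    """Return True when there are no finite values or they are all nearly equal.

    The tolerance is an arbitrary illustrative threshold, not a tuned value.
    """
    stats = summarize_values(values)
    if stats["min"] is None:
        return True
    return (stats["max"] - stats["min"]) <= tolerance
```

For a torch tensor `t`, the input could be obtained with `t.float().flatten().tolist()`. Running the same prompt once with `torch.float16` and once with `torch.float32` and comparing the summaries would show whether the problem is specific to half precision.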

jingxu10 commented 1 year ago

Hi, we are working on updating the Docker image for 1.13.10+xpu, but it takes time. Meanwhile, could you build an image by following the instructions at https://github.com/intel/intel-extension-for-pytorch/tree/v1.13.10%2Bxpu/docker? Thank you.

BA8F0D39 commented 1 year ago

@jingxu10 Which one works for A770? ipex flex or ipex max?

jingxu10 commented 1 year ago

ipex flex

BA8F0D39 commented 1 year ago

@jingxu10 Neither the ipex flex nor the ipex max v1.13.10+xpu Docker image has a working torchvision.

Is torchvision broken?

IMAGE_NAME=intel-extension-for-pytorch:xpu-flex

root@a147d2e06703:/# python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {ipex.xpu.get_device_properties(i)}') for i in range(ipex.xpu.device_count())];"
[W OperatorEntry.cpp:150] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: torchvision::nms
    no debug info
  dispatch key: CPU
  previous kernel: registered at /build/intel-pytorch-extension/csrc/cpu/aten/TorchVisionNms.cpp:47
       new kernel: registered at /opt/workspace/vision/torchvision/csrc/ops/cpu/nms_kernel.cpp:112 (function registerKernel)
/usr/local/lib/python3.10/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 
  warn(f"Failed to load image Python extension: {e}")
1.13.0a0+gitb1dde16
1.13.10+xpu
[0]: _DeviceProperties(name='Intel(R) Graphics [0x56a0]', platform_name='Intel(R) Level-Zero', dev_type='gpu, support_fp64=0, total_memory=15473MB, max_compute_units=512)

root@a147d2e06703:/# python -c "import torch; import torchvision; print(torch.__version__); print(torchvision.__version__); "
/usr/local/lib/python3.10/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 
  warn(f"Failed to load image Python extension: {e}")
1.13.0a0+gitb1dde16
0.14.1a0+0504df5

But torchvision works with torch 1.10.0a0+git3d5f2d4 and ipex 1.10.200+gpu.

jingxu10 commented 1 year ago

torchvision is included in the Docker image:
https://github.com/intel/intel-extension-for-pytorch/blob/v1.13.10%2Bxpu/docker/Dockerfile.ipex-flex-xpu#L105
https://github.com/intel/intel-extension-for-pytorch/blob/v1.13.10%2Bxpu/docker/build.sh#L21

Your own output also prints the torchvision version, so the import succeeded:

root@a147d2e06703:/# python -c "import torch; import torchvision; print(torch.__version__); print(torchvision.__version__); "
/usr/local/lib/python3.10/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 
  warn(f"Failed to load image Python extension: {e}")
1.13.0a0+gitb1dde16
0.14.1a0+0504df5

You can ignore those warning messages. They don't block execution of torchvision.
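If the warning is only noise, it can be hidden with Python's standard `warnings` module. This is a minimal sketch that filters on the message text shown in the output above. The function name `silence_torchvision_image_warning` is hypothetical, and the filter hides the message without fixing the underlying image extension.

```python
import warnings

# Regex matched against the start of the warning text (case-insensitive).
IMAGE_EXT_WARNING = r"Failed to load image Python extension"


def silence_torchvision_image_warning():
    """Install a filter that hides only the torchvision image-extension warning."""
    warnings.filterwarnings(
        "ignore",
        message=IMAGE_EXT_WARNING,
        category=UserWarning,
    )


# Call this before `import torchvision`, because the warning fires at import time.
silence_torchvision_image_warning()
```

Other `UserWarning` messages still get through, because the filter matches only this specific text.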