intel / intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
Apache License 2.0

Request support for PyTorch Lightning and provide installation packages for DeepSpeed #658

Closed uniartisan closed 2 months ago

uniartisan commented 3 months ago

Describe the issue

Dear Intel Team,

I am a user of PyTorch Lightning and am currently facing some issues. I hope Intel could provide support in this regard. The situation is as follows:

PyTorch Lightning currently does not support XPU, but it does support Intel's Gaudi accelerator. Could Intel please provide official support for XPU? Although there is a DeepSpeed plugin that offers some support, I encountered problems during the installation process. The related issues are listed below:

https://github.com/intel/intel-extension-for-deepspeed/issues/81

Since XPU has advantages in certain scenarios, providing official support for it would be greatly beneficial. If possible, not only could you provide packaged support for Lightning and DeepSpeed, but you could also consider merging them into the main PyTorch repository. This would not only provide a better experience for users but also help promote Intel's XPU technology.

I sincerely hope that the Intel team will pay attention to this issue. Thank you!

Best regards!

YuningQiu commented 3 months ago

Hello, many thanks for bringing up this suggestion. We will take it into consideration and let you know if there are any updates from our end. Thanks!

sophiehchen commented 3 months ago

We have a PR to extend Lightning to support Intel XPU: https://github.com/Lightning-AI/pytorch-lightning/pull/17700 There is an ongoing discussion with the Lightning community to upstream this PR. We expect this could unblock your work on Intel GPUs.

congdm commented 3 months ago

Having official support for DeepSpeed would be great. Also please provide support for Windows version of IPEX, atm I cannot find an official build of Intel oneccl library for Windows, so basically no way to use torch FSDP.

YuningQiu commented 3 months ago

Hi @congdm, for your information, you can get the Intel oneCCL library on Windows by installing the Intel oneAPI Base Toolkit https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline

congdm commented 3 months ago

> Hi @congdm, for your information, you can get the Intel oneCCL library on Windows by installing the Intel oneAPI Base Toolkit https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline

Sorry, but I have checked my full installation and there is no "ccl.h" in include folder at all. Also the information in this page https://www.intel.com/content/www/us/en/developer/articles/system-requirements/oneapi-collective-communication-library-system-requirements.html only shows support for Linux OSes

YuningQiu commented 3 months ago

> > Hi @congdm, for your information, you can get the Intel oneCCL library on Windows by installing the Intel oneAPI Base Toolkit https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline
>
> Sorry, but I have checked my full installation and there is no "ccl.h" in include folder at all. Also the information in this page https://www.intel.com/content/www/us/en/developer/articles/system-requirements/oneapi-collective-communication-library-system-requirements.html only shows support for Linux OSes

Yes, you are right: on Windows, oneCCL is not included as part of the Intel® oneAPI Base Toolkit.

ZupoLlask commented 3 months ago

@YuningQiu So, is this missing 'ccl.h' going to be fixed soon? Thanks.

YuningQiu commented 3 months ago

> @YuningQiu So, is this missing 'ccl.h' going to be fixed soon? Thanks.

Hello, thanks for bringing up this question. May I ask why you are expecting ccl.h? If you have DeepSpeed, you should have ccl.hpp.
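
For readers who want to check which oneCCL headers an installation actually contains, here is a minimal sketch using only the Python standard library. The helper name `find_ccl_headers` is hypothetical, and the script assumes the `ONEAPI_ROOT` environment variable points at the oneAPI install directory; if it is not set, it searches the current directory instead. It only reports which files exist and says nothing about whether the library is usable on a given OS.

```python
import os
from pathlib import Path


def find_ccl_headers(root, names=("ccl.h", "ccl.hpp")):
    """Walk `root` and return {header name: sorted list of matching paths}.

    Hypothetical helper for illustration; not part of oneCCL or IPEX.
    """
    found = {name: [] for name in names}
    root = Path(root)
    if not root.is_dir():
        # Nothing to search: report every header as missing.
        return found
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in names:
            if name in filenames:
                found[name].append(str(Path(dirpath) / name))
    for name in names:
        found[name].sort()
    return found


if __name__ == "__main__":
    # Assumption: ONEAPI_ROOT is set in the environment; fall back to ".".
    search_root = os.environ.get("ONEAPI_ROOT", ".")
    for header, paths in find_ccl_headers(search_root).items():
        status = ", ".join(paths) if paths else "not found"
        print(f"{header}: {status}")
```

Running the script inside an initialized oneAPI shell lists every `ccl.h` and `ccl.hpp` found under the install root, or `not found` for each header that is missing.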

YuningQiu commented 3 months ago

> @YuningQiu So, is this missing 'ccl.h' going to be fixed soon? Thanks.

Also, could you please raise this question in the oneCCL GitHub repo, which is better equipped to provide the specific support you require?

YuningQiu commented 2 months ago

Closing this issue for now. Please feel free to reopen it or create a new issue if there are still any questions or concerns. Thanks a lot!

uniartisan commented 2 months ago

> Having official support for DeepSpeed would be great. Also please provide support for Windows version of IPEX, atm I cannot find an official build of Intel oneccl library for Windows, so basically no way to use torch FSDP.

I saw it has been merged into DeepSpeed.

uniartisan commented 2 months ago

> We have a PR to extend Lightning to support Intel XPU: https://github.com/Lightning-AI/pytorch-lightning/pull/17700 There is an ongoing discussion with the Lightning community to upstream this PR. We expect this could unblock your work on Intel GPUs.

Could we consider these options to move forward with PyTorch Lightning XPU support?

  1. Temporarily merge upstream changes into Intel's branch for a usable interim version.
  2. If merging isn't feasible, maintain a separate "PyTorch-Lightning-XPU" fork.

This would help us continue development and provide XPU users with a working version. Thoughts on this approach?

It would be great to see this implemented. Appreciate the effort!