Open yuwenzho opened 9 months ago
Related PR #19125
Hello @yuwenzho Thank you for opening the pull request. We discussed this internally and want to propose that we do this integration in an external plugin repository first instead of moving into core directly. The idea is that we create a separate repository with you as the co-maintainer, for example "lightning-intel" (actual name to be decided) and we provide an easy way to enable it for the user like so:
```bash
# Optional dependency
pip install lightning-intel
```

```python
# In user code:
from lightning.pytorch import Trainer
from lightning_intel import ITREXPrecision

trainer = Trainer(..., precision=ITREXPrecision(mode=...))
```
We have followed the same approach with other partners like Habana: https://github.com/Lightning-AI/lightning-Habana https://lightning.ai/docs/pytorch/stable/integrations/hpu/intermediate.html
A benefit for you would be that you could make changes to the integration outside of Lightning core faster, for example when a new version of your backend is released.
Thanks, and we would love to hear your thoughts.
cc @lantiga @carmocca @Borda
@awaelchli Many thanks for your valuable input. We are evaluating the feasibility and will get back to you once we have a conclusion.
@awaelchli Sorry for the late reply. Below is our RFC. We appreciate your thoughts and feedback; please feel free to share any comments or suggestions you have.
Our expected external repository name: Lightning-AI/lightning-Intel
Directory layout:
```
lightning-Intel
└───src
    └───lightning_intel
        ├───fabric
        │   └───plugins
        │       ├───precision.py   # customized WOQ precision
        │       └───io_plugin.py   # checkpoint save/load
        │
        └───pytorch
            └───plugins
                ├───precision.py   # customized WOQ precision
                └───io_plugin.py   # checkpoint save/load
```
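For the `precision.py` entries, a minimal pure-Python sketch of what an `INCPrecision` plugin could look like. Names and the stubbed base class are illustrative; a real implementation would subclass `lightning.fabric.plugins.precision.Precision` and call into the INC/ITREX quantization APIs:

```python
# Hypothetical sketch of lightning_intel/fabric/plugins/precision.py.
# The Precision base class here is a stand-in so the sketch is
# self-contained; the real one comes from lightning.fabric.plugins.

SUPPORTED_MODES = ("int8", "int4", "fp4", "nf4")


class Precision:  # stand-in for lightning.fabric.plugins.precision.Precision
    def convert_module(self, module):
        return module


class INCPrecision(Precision):
    """Weight-only quantization (WOQ) precision plugin (illustrative)."""

    def __init__(self, mode: str = "int4"):
        if mode not in SUPPORTED_MODES:
            raise ValueError(
                f"Unsupported mode {mode!r}; expected one of {SUPPORTED_MODES}"
            )
        self.mode = mode

    def convert_module(self, module):
        # A real implementation would quantize the module's weights here,
        # e.g. via the ITREX/neural-compressor WOQ entry points.
        return module
```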
Usage Demo:
```python
# Usage with Lightning Fabric
from lightning.fabric import Fabric
from lightning_intel.fabric.plugins import INCPrecision

fabric = Fabric(plugins=INCPrecision(mode="int4"))
model = MyModel()
model = fabric.setup(model)
```

```python
# Usage with PyTorch Lightning
from lightning.pytorch import Trainer
from torch.utils.data import DataLoader
from lightning_intel.pytorch.plugins import INCPrecision

trainer = Trainer(plugins=INCPrecision(mode="int4"))
model = MyModel()
trainer.fit(model, train_dataloaders=DataLoader(train_set))
predictions = trainer.predict(model, dataloaders=DataLoader(pred_set))
```
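The layout above also lists an `io_plugin.py` for checkpoint save/load. A minimal pure-Python sketch of a CheckpointIO-style plugin follows; the class name and pickle-based storage are illustrative, and a real implementation would subclass `lightning.fabric.plugins.io.CheckpointIO` and persist quantization metadata (scales, zero points) alongside the weights:

```python
import pickle
from pathlib import Path

# Illustrative CheckpointIO-style plugin (not the real Lightning base class).
# A real implementation would subclass lightning.fabric.plugins.io.CheckpointIO.


class INCCheckpointIO:
    def save_checkpoint(self, checkpoint: dict, path) -> None:
        # Serialize the checkpoint dict (weights plus quantization metadata).
        Path(path).write_bytes(pickle.dumps(checkpoint))

    def load_checkpoint(self, path) -> dict:
        return pickle.loads(Path(path).read_bytes())

    def remove_checkpoint(self, path) -> None:
        Path(path).unlink(missing_ok=True)
```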
Hi @awaelchli Could you please help us create Lightning-AI/lightning-Intel repo so that we can start our code contributions? cc @hshen14
Description & Motivation
Hello,
We are the team developing Intel® Extension for Transformers (ITREX). We would like to discuss the `quantize` feature in relation to our projects. Allow us to first introduce both projects:
We would like to integrate ITREX into the PyTorch Lightning Fabric API. This integration could cover the INT8/INT4/FP4/NF4 weight-only quantization features.
We would like to ask whether there is an opportunity for us to contribute in this regard.
Thanks
Pitch
Here is a simple use case:
For more details on ITREX 4-bit, please refer to the Medium blog post on Intel-Optimized Llama.CPP.
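To make the weight-only quantization idea concrete, here is a minimal pure-Python sketch of symmetric INT4 quantization with a single per-tensor scale. This is illustrative arithmetic only, not the ITREX implementation (which supports per-group scales and additional formats such as FP4/NF4):

```python
# Illustrative symmetric INT4 weight-only quantization (not ITREX code).
# Signed int4 values span [-8, 7]; weights share one per-tensor scale.


def quantize_int4(weights):
    """Quantize a list of float weights to int4 values plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int4(q, scale):
    """Recover approximate float weights from int4 values."""
    return [v * scale for v in q]


weights = [0.12, -0.5, 0.33, 0.01]
q, scale = quantize_int4(weights)
approx = dequantize_int4(q, scale)
```

The round-trip error per weight is bounded by half the scale, which is what makes 4-bit storage usable for inference when the original weights are well-distributed.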
Alternatives
No response
Additional context
No response
cc @borda