huggingface / optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools
https://huggingface.co/docs/optimum/main/en/intel/index
Apache License 2.0

add XPU support for `IPEXModel.from_pretrained` #704

Closed: faaany closed this 4 months ago

faaany commented 4 months ago

What does this PR do?

This PR adds XPU support for loading a TorchScript model using IPEXModel.from_pretrained. Below is a test example:

import torch
from transformers import AutoTokenizer
from optimum.intel import IPEXModel

model_id = "faaany/bert-base-uncased-float32-traced"
model_id_tokenizer = "google-bert/bert-base-uncased"

# Load the traced TorchScript model; with this PR it is placed on XPU when available
model = IPEXModel.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id_tokenizer)

# Move the tokenized inputs to the XPU device to match the model
inputs = tokenizer("Paris is the capital of France.", return_tensors="pt").to("xpu")
print(model.device)

with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs[0][:, 0]  # CLS-token embeddings
    print(embeddings)

Please note that faaany/bert-base-uncased-float32-traced is just a test model, traced from "google-bert/bert-base-uncased". Since I didn't upload a tokenizer config with it, I pass model_id_tokenizer to AutoTokenizer in the example above.

faaany commented 4 months ago

@echarlaix could you please review? Thanks!

faaany commented 4 months ago

@yao-matrix

faaany commented 4 months ago

Hi @IlyasMoutawwakil could you help review this PR? Thanks a lot!

HuggingFaceDocBuilderDev commented 4 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

echarlaix commented 4 months ago

To fix the code style test you can do the following :

pip install .[quality]
make style

faaany commented 4 months ago

Hi @echarlaix, some CI tests are failing. Do you know what I can do to fix them? Thanks!