NetEase-FuXi / EETQ

Easy and Efficient Quantization for Transformers
Apache License 2.0

Does it support Vision Transformers? #21

Closed: PaulaDelgado-Santos closed this issue 3 weeks ago

PaulaDelgado-Santos commented 4 months ago

Hi, I would like to know whether EETQ and LoRA support ViT, and if so, whether I could have an example along these lines:

```python
from transformers import ViTForImageClassification, ViTImageProcessor
from peft import get_peft_model, LoraConfig

original_model = ViTForImageClassification.from_pretrained('google/vit-base-patch16-224-in21k')
peft_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.01,
    target_modules=["query", "value"],
    task_type="FEATURE_EXTRACTOR",
)
peft_model = get_peft_model(original_model, peft_config)

# Apply quantization to all layers except lora
```

Thank you :)

SidaZh commented 4 months ago

@PaulaDelgado-Santos EETQ currently supports `AutoModelForCausalLM` in transformers and peft, and LoRA is supported for inference only (see https://github.com/huggingface/peft/blob/cb0bf077744d11524ec6f68d920f4cfe4ef3e8f3/tests/test_gpu_examples.py#L2698). For more general use, you can refer to this example: https://github.com/NetEase-FuXi/EETQ/blob/1e89c802ecd4b106d1df2afb3ded51f05f8d3c67/examples/models/llama_transformers_example.py#L165

```python
import torch.nn as nn  # needed for the include filter below

# First, merge the LoRA weights into the base model
model = get_peft_model(model, peft_config)
model = model.merge_and_unload()

# Then quantize with eetq; "xxx" is a placeholder for names of modules to skip
eet_quantize(model, init_only=False, include=[nn.Linear], exclude=["xxx"], device="cuda:0")
```
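
Adapting this to the ViT setup from your question, the workflow would look roughly like the sketch below. This is an untested sketch, not a verified configuration: it assumes `eet_quantize` is importable from the `eetq` package as in the linked example, that ViT's `query`/`value` projections are ordinary `nn.Linear` modules (so they quantize like any other linear layer), and that the classification head module is named `classifier` and should stay in full precision via `exclude`.

```python
import torch.nn as nn
from transformers import ViTForImageClassification
from peft import get_peft_model, LoraConfig
from eetq import eet_quantize  # import path assumed from the linked example

# 1. Load the base ViT and attach LoRA adapters to the attention projections
model = ViTForImageClassification.from_pretrained('google/vit-base-patch16-224-in21k')
peft_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.01,
    target_modules=["query", "value"],
    task_type="FEATURE_EXTRACTION",  # peft's TaskType name (FEATURE_EXTRACTION, not FEATURE_EXTRACTOR)
)
model = get_peft_model(model, peft_config)

# ... fine-tune the LoRA adapters here ...

# 2. Fold the LoRA weights back into the base nn.Linear weights,
#    so the quantizer only sees ordinary linear layers
model = model.merge_and_unload()

# 3. Quantize every nn.Linear except the classification head
#    ("classifier" is the assumed name of the ViT head module)
eet_quantize(model, init_only=False, include=[nn.Linear], exclude=["classifier"], device="cuda:0")
```

Merging first is what makes "quantize all layers except LoRA" unnecessary: after `merge_and_unload()` there are no separate LoRA modules left, only the updated base weights.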
NetEase-FuXi commented 3 weeks ago

This issue was closed because it has been stalled for 30 days with no activity. Please feel free to reopen it if needed.