[Closed] donjuanpond closed 3 months ago
I can quantize the model with `eet_quantize` from the eetq package, but I couldn't get it to work through the transformers load-time config, and the output now looks correct:
```python
from PIL import Image
import torch
from eetq import eet_quantize
from transformers import EetqConfig, TrOCRProcessor, VisionEncoderDecoderModel

# Load a handwriting image (e.g. from the IAM database)
image = Image.open("/path/to/").convert("RGB")

# Left over from the load-time attempt; unused when quantizing post-load
config = EetqConfig("int8")

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten").to(torch.float16).cuda()

# Quantize in place with EETQ, keeping the decoder's output projection in fp16
eet_quantize(model, exclude=["output_projection"])

# Cast inputs to the model's fp16 dtype before generation
pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(torch.float16).cuda()
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
```
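If you're unsure which module names to pass to `exclude`, one way to see the Linear layers EETQ would touch is to walk the model's submodules. A quick sketch using plain PyTorch, assuming `exclude` entries are matched against these module names (for TrOCR the vocabulary head shows up as `decoder.output_projection`):

```python
import torch

# Print every Linear submodule so you can choose names for
# eet_quantize's exclude list.
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear):
        print(name, tuple(module.weight.shape))
```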
OK, it looks like the `exclude=['output_projection']` part of your code, together with quantizing via the eetq package rather than at load time with the config, is working. Thank you for your help!
Hello! I'm using EETQ through Hugging Face Transformers to quantize my TrOCR model (a vision encoder-decoder). It is meant to generate text output from an image input, transcribing whatever text is shown in the image. I tried to quantize the model through EETQ to speed up inference, using the following code:
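[The original snippet was not preserved; below is a minimal reconstruction of the load-time approach, assuming the stock `EetqConfig` integration in transformers:]

```python
from PIL import Image
import torch
from transformers import EetqConfig, TrOCRProcessor, VisionEncoderDecoderModel

# Quantize all linear layers to int8 at load time
config = EetqConfig("int8")
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained(
    "microsoft/trocr-base-handwritten",
    quantization_config=config,
    device_map="auto",
)

image = Image.open("/path/to/").convert("RGB")
# Quantized weights compute in fp16, so cast the inputs to match
pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(torch.float16).cuda()
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```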
When I run this quantized model, I get very strange results: it transcribes all the text in the image as the single word "to". For example, an image that should have been transcribed as "3042846 JG-002" comes out as "to to to to to to to to to", and so on. What is causing this problem, and how can I fix it?