Open eschaffn opened 1 year ago
I'm sorry, but I'm also new to quantization and model deployment in general.
Another route is to use a bigger model, "sparsify" it, and then prune the unused connections to reduce inference time.
Hey there!
Is it possible to do post-training quantization with PARSeq? I'm looking for ways to speed up inference. I tried training a parseq-tiny model but lost about 13% absolute validation accuracy.
I'm new to quantization and am unsure which types of models benefit from it or which type of quantization to use.
Thanks for any suggestions!
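For anyone new to the topic, the core idea behind post-training quantization is just mapping float weights onto a small integer grid. A minimal sketch of affine int8 quantization is below; this is illustrative only and not PARSeq-specific (real frameworks use per-channel scales and calibrated activation ranges):

```python
# Minimal affine int8 quantization of a weight list (illustrative only).
def quantize(ws, num_bits=8):
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(ws), max(ws)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constant weights
    zero_point = round(qmin - lo / scale)
    # Round each weight onto the integer grid and clamp to [qmin, qmax].
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in ws]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Map integers back to floats; the round-trip error is bounded by ~scale/2.
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.2, -0.4, 0.0, 0.7, 1.5]
q, s, z = quantize(weights)
recovered = dequantize(q, s, z)
```

The per-weight round-trip error is at most about half the scale, which is why accuracy usually degrades only slightly when the weight range is well behaved.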
Did you manage to speed up inference time? And did you use post-training quantization?
I have added quantization support in a separate fork of this repo, which you can check here: https://github.com/VikasOjha666/parseq
By default, the model is trained with quantization-aware training, which helps preserve accuracy after quantization.
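For the post-training route asked about above, PyTorch's dynamic quantization is a low-effort starting point: weights are stored as int8 and activations are quantized on the fly at inference. The sketch below uses a small stand-in model, not PARSeq itself, since loading the real checkpoint is outside the scope of this example:

```python
import torch
import torch.nn as nn

# Stand-in for a trained model; replace with your own eval-mode model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
model.eval()

# Post-training dynamic quantization: int8 weights for all nn.Linear layers.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(4, 16)
with torch.no_grad():
    y_fp32 = model(x)
    y_int8 = qmodel(x)
```

Note that transformer-based models like PARSeq are dominated by linear layers, so dynamic quantization of `nn.Linear` is usually where most of the speedup comes from on CPU; measure latency before and after, since gains depend on the backend (fbgemm on x86, qnnpack on ARM).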