roman-dobrov opened this issue 1 year ago
Hi, thanks a lot for your interest in the INSTRUCTOR model!
The following works for me:
import torch
from InstructorEmbedding import INSTRUCTOR
from torch.nn import Embedding, Linear
from torch.quantization import quantize_dynamic

# Load the full-precision model on CPU (dynamic quantization runs on CPU)
model = INSTRUCTOR('hkunlp/instructor-large', device='cpu')
# Embeddings only support weight-only quantization; Linears use the default dynamic qconfig
qconfig_dict = {
    Embedding: torch.ao.quantization.qconfig.float_qparams_weight_only_qconfig,
    Linear: torch.ao.quantization.qconfig.default_dynamic_qconfig,
}
qmodel = quantize_dynamic(model, qconfig_dict)
# Save only the (now quantized) weights
torch.save(qmodel.state_dict(), 'state.pt')
Hope this helps!
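(As a quick sanity check, not part of the original reply: after quantize_dynamic, the affected nn.Linear modules should be replaced by their dynamically quantized counterparts, which you can verify on the qmodel from the snippet above.)
from torch.ao.nn.quantized.dynamic import Linear as QLinear
# Expect True: quantize_dynamic swaps nn.Linear for its dynamic quantized version
print(any(isinstance(m, QLinear) for m in qmodel.modules()))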
@hongjin-su Thank you for your response! Does loading the quantized model work for you?
Yeah, this seems to work:
>>> import torch
>>> a = torch.load('state.pt')
/home/linuxbrew/.linuxbrew/Cellar/python@3.11/3.11.6/lib/python3.11/site-packages/torch/_utils.py:376: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
device=storage.device,
@hongjin-su And how do you convert it back to the actual model? torch.load returns an OrderedDict, which is just a state dict. I get the aforementioned error when I call load_state_dict before actually using the model.
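For reference, the standard PyTorch pattern here, sketched under the assumption that the model was saved exactly as in the earlier comment, is to rebuild and re-quantize the model first, so its module structure matches the quantized state dict, and only then call load_state_dict:
import torch
from InstructorEmbedding import INSTRUCTOR
from torch.nn import Embedding, Linear
from torch.quantization import quantize_dynamic

# Rebuild the architecture and apply the exact same quantization config
# used before saving, so keys and packed parameters line up
model = INSTRUCTOR('hkunlp/instructor-large', device='cpu')
qconfig_dict = {
    Embedding: torch.ao.quantization.qconfig.float_qparams_weight_only_qconfig,
    Linear: torch.ao.quantization.qconfig.default_dynamic_qconfig,
}
qmodel = quantize_dynamic(model, qconfig_dict)
qmodel.load_state_dict(torch.load('state.pt'))
Note that this still runs quantization at startup; it only guarantees the saved weights can be loaded, which matters for the question below.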
Hello! First of all, great work on instructor.
I'd like to load an already-quantized model to avoid the CPU/memory spikes at script startup that happen during quantization itself.
I tried static quantization first, but it is not supported for SentenceTransformers with float16 or qint8. For dynamic quantization I get the following errors when trying to load a saved state_dict:
I tried two save methods: saving the weights directly with torch.save(model.state_dict(), 'state.pt'), and saving a traced version with torch.jit.trace,
but both result in the same error. So, is there a way to save/load a quantized model?
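If the goal is specifically to skip the quantization work at startup, one workaround (my own sketch, not confirmed by the maintainers here; 'qmodel.pt' is a hypothetical filename) is to pickle the whole quantized module instead of just its state_dict:
import torch
# Pickling the entire quantized module avoids rebuilding and re-quantizing at
# startup; the InstructorEmbedding classes must be importable when loading,
# and the saved file can break across library versions
torch.save(qmodel, 'qmodel.pt')

# later, in the serving process:
qmodel = torch.load('qmodel.pt')
On newer torch versions you may need torch.load('qmodel.pt', weights_only=False) to allow full-object unpickling.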