aditya-y47 opened this issue 1 year ago
Yeah, INSTRUCTOR is highly similar to sentence-transformer in terms of the model architecture. Therefore, any optimization that applies to sentence-transformer models may also be applicable to the INSTRUCTOR models.
Recently, there have been some efforts in model quantization, which you may take as references:
- https://www.sbert.net/examples/training/distillation/README.html#quantization
- https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/distillation/model_quantization.py
Hope this helps!
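For reference, the linked sentence-transformers example is based on PyTorch dynamic quantization. A minimal sketch of that technique is below; the toy encoder is a stand-in (not the actual INSTRUCTOR model), and in practice you would pass the loaded transformer instead:

```python
import torch
import torch.nn as nn

# Minimal sketch of dynamic quantization, the technique used in the
# sentence-transformers model_quantization.py example. The toy encoder
# below is a placeholder for the real transformer.
model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 256))
model.eval()

# Convert all nn.Linear weights to int8. Activations stay in float and
# are quantized on the fly, so no calibration data is needed.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 768)
with torch.no_grad():
    out = quantized(x)
print(out.shape)  # torch.Size([1, 256])
```

Dynamic quantization typically shrinks the model and speeds up CPU inference with only a small drop in embedding quality, but it is worth benchmarking retrieval accuracy on your own corpus before committing to it.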
Hey! First up, thank you for building and open-sourcing such a great piece of work. I have been using INSTRUCTOR for some time now and I absolutely love it.
I'm planning on generating embeddings for a large corpus of texts (million scale), and I intend to schedule the embedding-generation job as an async, message-queue-based execution. Based on some of my initial estimates, the run time is a bit on the higher side, so I was hoping certain methods could be used to optimize the generation of embeddings. Some of them include:
Are there any generally prescribed guidelines that would help me achieve this? Is anyone here working on such optimizations?
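For context, the batched encoding step I have in mind looks roughly like this. It is only a sketch: `encode_fn` and `stub_encode` are placeholders for the real `model.encode` call on (instruction, text) pairs, and the batch size is illustrative:

```python
def encode_in_batches(pairs, encode_fn, batch_size=32):
    """Encode a list of (instruction, text) pairs batch by batch,
    rather than one text at a time, to keep the GPU/CPU saturated."""
    embeddings = []
    for i in range(0, len(pairs), batch_size):
        embeddings.extend(encode_fn(pairs[i:i + batch_size]))
    return embeddings

def stub_encode(batch):
    # Placeholder embedder: returns one fixed-size vector per input pair.
    return [[float(len(text)), 0.0, 0.0] for _, text in batch]

pairs = [("Represent the document:", f"doc {i}") for i in range(5)]
vecs = encode_in_batches(pairs, stub_encode, batch_size=2)
print(len(vecs))  # 5
```

In the real pipeline, each message-queue worker would pull a chunk of texts, run it through a loop like this, and write the vectors out, so batch size becomes the main throughput knob to tune.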