KnutJaegersberg closed this issue 1 year ago.
RE: Smaller: There's the 2.7B: https://huggingface.co/Muennighoff/SGPT-2.7B-weightedmean-msmarco-specb-bitfit & 1.3B: https://huggingface.co/Muennighoff/SGPT-1.3B-weightedmean-msmarco-specb-bitfit
RE: Lower precision: I haven't tried loading SGPT models in low precision (the example code loads them in FP32), but it should work. The below code should load it in 8-bit:
# pip install -q transformers accelerate bitsandbytes
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit")
model = AutoModel.from_pretrained("Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit", device_map="auto", load_in_8bit=True)
There's https://huggingface.co/blog/hf-bitsandbytes-integration by @ybelkada for more info! :)
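A plain FP16 load should also fit the 5.8B model on a 24 GB card; a sketch below (again, untested with these checkpoints) using the standard torch_dtype argument of transformers:
# FP16 alternative (sketch, not verified with these checkpoints)
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit",
    torch_dtype=torch.float16,  # halves memory vs. FP32
    device_map="auto",          # requires accelerate
)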
RE: Other models: Currently, the largest models are the BLOOM-7b1 / 5.8B ones (note that for English the 5.8B performs better). I've been thinking about fine-tuning one of the recent larger models, such as Galactica or LLaMA, but haven't gotten around to it yet!
Thanks. Excited to see how well larger models will work. Yes, I saw the smaller models and used them, but in the benchmarks the larger models seem clearly better. I have used other models in transformers in 8-bit/16-bit mode for text generation, but I have not yet understood how to extract sentence embeddings. Is there a way to pass the 8-bit transformers model into the sentence-transformers library?
I haven't tried it, but since sentence-transformers just wraps AutoModel it should be possible - worst case you can just modify the call to AutoModel inside the SentenceTransformer class.
Maybe the code from the below issue helps:
https://github.com/UKPLab/sentence-transformers/issues/1803
Hmm I guess I can use this code here then: https://www.sbert.net/examples/applications/computing-embeddings/README.html#sentence-embeddings-with-transformers
Thanks!
Yeah, that's essentially the same code as here: https://github.com/Muennighoff/sgpt#use-sgpt-with-huggingface. Make sure to use the weighted mean pooling from that code 👍
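For reference, a minimal sketch combining the 8-bit loading from above with the position-weighted mean pooling from the SGPT README; I haven't run it in 8-bit, and note that the specb (asymmetric search) checkpoints additionally use special bracket tokens for queries vs. documents, as described in the README:
# Sketch: 8-bit loading + SGPT-style weighted mean pooling (untested in 8-bit)
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, device_map="auto", load_in_8bit=True)
model.eval()

texts = ["deep learning", "artificial intelligence"]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    last_hidden = model(**batch, output_hidden_states=True, return_dict=True).last_hidden_state

# Position weights increase linearly with token position: 1, 2, ..., seq_len
weights = (
    torch.arange(start=1, end=last_hidden.shape[1] + 1)
    .unsqueeze(0)
    .unsqueeze(-1)
    .expand(last_hidden.size())
    .float()
    .to(last_hidden.device)
)

# Mask out padding tokens before averaging
mask = batch["attention_mask"].unsqueeze(-1).expand(last_hidden.size()).float()

embeddings = torch.sum(last_hidden * mask * weights, dim=1) / torch.sum(mask * weights, dim=1)
print(embeddings.shape)  # (batch_size, hidden_size)
Because the weights grow linearly with token position, later tokens contribute more to the embedding - that's the "weightedmean" in the model names.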
Great work! I'm currently trying out some of your models. I can't quite fit SGPT-5.8B and SGPT-BLOOM-7B1 on my RTX 3090. Thus I have two ideas: