KnutJaegersberg closed this issue 1 year ago.
RE: Smaller: There's the 2.7B: https://huggingface.co/Muennighoff/SGPT-2.7B-weightedmean-msmarco-specb-bitfit & 1.3B: https://huggingface.co/Muennighoff/SGPT-1.3B-weightedmean-msmarco-specb-bitfit
RE: Lower precision: I haven't tried loading SGPT models in low precision (the example code loads them in FP32), but it should work. The below code should load it in 8-bit:
# pip install -q transformers accelerate bitsandbytes
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit")
model = AutoModel.from_pretrained("Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit", device_map="auto", load_in_8bit=True)
There's https://huggingface.co/blog/hf-bitsandbytes-integration by @ybelkada for more info! :)
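A plain FP16 load should also fit the 5.8B model on a 24 GB card; a sketch below (again, untested with these checkpoints) using the standard torch_dtype argument of transformers:
# FP16 alternative (sketch, not verified with these checkpoints)
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit",
    torch_dtype=torch.float16,  # halves memory vs. FP32
    device_map="auto",          # requires accelerate
)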
RE: Other models: Currently, the largest models are the BLOOM-7b1 / 5.8B ones (note that for English the 5.8B performs better). I've been thinking about fine-tuning one of the recent larger models, such as Galactica or LLaMA, but haven't gotten around to it yet!
Thanks. Excited to see how well larger models will work. Yes, I saw the smaller models and used them, but in the benchmarks the larger models seem clearly better. I have used other models in transformers in 8-bit/16-bit mode for text generation, but I have not yet understood how to extract sentence embeddings. Is there a way to pass the 8-bit transformers model into the sentence-transformers library?
I haven't tried it, but since sentence-transformers just wraps AutoModel it should be possible - worst case you can just modify the call to AutoModel inside the SentenceTransformer class.
Maybe the code from the below issue helps:
https://github.com/UKPLab/sentence-transformers/issues/1803
Hmm I guess I can use this code here then: https://www.sbert.net/examples/applications/computing-embeddings/README.html#sentence-embeddings-with-transformers
Thanks!
Yeah, that's essentially the same code as here: https://github.com/Muennighoff/sgpt#use-sgpt-with-huggingface. Make sure to use the weighted mean pooling from that code 👍
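For reference, a minimal sketch combining the 8-bit loading from above with the position-weighted mean pooling from the SGPT README; I haven't run it in 8-bit, and note that the specb (asymmetric search) checkpoints additionally use special bracket tokens for queries vs. documents, as described in the README:
# Sketch: 8-bit loading + SGPT-style weighted mean pooling (untested in 8-bit)
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, device_map="auto", load_in_8bit=True)
model.eval()

texts = ["deep learning", "artificial intelligence"]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    last_hidden = model(**batch, output_hidden_states=True, return_dict=True).last_hidden_state

# Position weights increase linearly with token position: 1, 2, ..., seq_len
weights = (
    torch.arange(start=1, end=last_hidden.shape[1] + 1)
    .unsqueeze(0)
    .unsqueeze(-1)
    .expand(last_hidden.size())
    .float()
    .to(last_hidden.device)
)

# Mask out padding tokens before averaging
mask = batch["attention_mask"].unsqueeze(-1).expand(last_hidden.size()).float()

embeddings = torch.sum(last_hidden * mask * weights, dim=1) / torch.sum(mask * weights, dim=1)
print(embeddings.shape)  # (batch_size, hidden_size)
Because the weights grow linearly with token position, later tokens contribute more to the embedding - that's the "weightedmean" in the model names.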
Great work! I'm currently trying out some of your models. I can't quite fit SGPT-5.8B and SGPT-BLOOM-7B1 on my RTX 3090. Thus I have two ideas: