4AI / BeLLM

Code for BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings (NAACL2024)
https://arxiv.org/abs/2311.05296
MIT License

load trained model with transformers.AutoModel? #3

Closed dribnet closed 3 months ago

dribnet commented 3 months ago

Excellent paper! Excited to learn more and run some tests on the trained model listed on Hugging Face. But it seems it can't be loaded with transformers.AutoModel? Or maybe I'm doing something wrong. 😅

When I try

```python
model = AutoModel.from_pretrained('SeanLee97/bellm-llama-7b-nli').to(device)
```

I get:

```
OSError: SeanLee97/bellm-llama-7b-nli does not appear to have a file named config.json. Checkout 'https://huggingface.co/SeanLee97/bellm-llama-7b-nli/tree/main' for available files.
```

Just checking, since this AutoModel loading method worked fine for BeLLM's big sister AnglE (and half-sister 2dmse 😂), though perhaps those load more smoothly because they follow a more conventional BERT architecture.

dribnet commented 3 months ago

Never mind: this issue goes away if I just pip install angle-emb + billm as the README indicates. 🙃

Note that requirements.txt does seem to have a typo: the latest angle-emb is 0.3.10, not 3.1.0. I used 0.3.10 and everything now loads fine.
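For anyone else hitting the same version error, pinning the corrected release in requirements.txt would presumably look something like this (the exact package spelling in that file is an assumption; on PyPI the project installs as angle-emb and imports as angle_emb):

```
angle-emb==0.3.10
billm
```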