Closed Georgepitt closed 1 month ago
@Georgepitt
Thanks for your interest in our work.
The LLM can be used directly with the following code.
```python
import torch
from llm2vec import LLM2Vec

if __name__ == "__main__":
    l2v = LLM2Vec.from_pretrained(
        "meta-llama/Meta-Llama-3-8B-Instruct",
        enable_bidirectional=False,
        device_map="cuda" if torch.cuda.is_available() else "cpu",
        torch_dtype=torch.bfloat16,
        pooling_mode="mean",
    )
```
The encoding steps are the same as those mentioned in the README.
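To make the `pooling_mode="mean"` setting above concrete, here is a minimal sketch of what mean pooling does: it averages the token hidden states while masking out padding positions. This is an illustrative re-implementation in plain PyTorch under my own assumptions, not the library's actual internal code.

```python
import torch

def mean_pool(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over valid (non-padding) positions.

    hidden_states: (batch, seq_len, dim) token representations
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)  # (batch, seq_len, 1)
    summed = (hidden_states * mask).sum(dim=1)                   # sum of valid tokens
    counts = mask.sum(dim=1).clamp(min=1)                        # number of valid tokens
    return summed / counts

# Toy example: two valid tokens, one padded token that must be ignored.
h = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
m = torch.tensor([[1, 1, 0]])
print(mean_pool(h, m))  # -> tensor([[2., 3.]])
```

The padded token's large values do not leak into the sentence embedding because the mask zeroes them out before summing.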
Enabling bidirectional connections is handled by the llm2vec library. For more details on how it is implemented, you can check out our tutorial. Currently, only the Llama and Mistral model families are supported; any other model family will need to be implemented separately. You can check #81 and #70 for related discussion.
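Conceptually, the `enable_bidirectional` flag controls which positions each token may attend to. A decoder-only LLM uses a causal (lower-triangular) attention mask, while llm2vec removes that constraint so every token sees the full sequence. The sketch below only illustrates the two mask shapes; the library itself achieves this by patching the model's attention classes, not by building masks like this.

```python
import torch

def attention_mask(seq_len: int, bidirectional: bool) -> torch.Tensor:
    """Return a boolean (seq_len, seq_len) mask; True means attention is allowed."""
    if bidirectional:
        # Every token attends to every position, including future ones.
        return torch.ones(seq_len, seq_len, dtype=torch.bool)
    # Causal: token i attends only to positions <= i (lower triangle).
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

print(attention_mask(3, bidirectional=False))
print(attention_mask(3, bidirectional=True))
```

With the causal mask, position 0 cannot attend to position 2; with the bidirectional mask it can, which is what makes the representations useful for embedding tasks.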
Thank you for your help! It's really convenient! (●′ω`●)
No problem! Happy to help!
In your article, you directly use an LLM for the embedding task as your baseline. I am curious how this is done; can you show me your script? There is also the "Enabling bidirectional attention (Bi)" section, which sounds like the model needs to be modified. Does using the mntp files in this repository automatically enable bidirectional attention, or does the model need to be modified manually? Thank you for your help!