Open Yan2266336 opened 1 month ago
Hi, can you share a code snippet to reproduce the issue?
OK. First, I converted the base Llama-3-8B-Instruct model into an embedder named "Llama-3-8B-instruct-Emb".
Then I followed your instructions to reload this model and use it to get embeddings.
Finally, I used "l2v.encode()" to generate the embeddings.
This is just one example of the issue. Most sentences produce embeddings correctly, but some sentences raise this error, and I don't know how to solve it.
Hi @Yan2266336,
I believe you are mixing two different ways of loading the model. Loading with transformers and trust_remote_code is required for Hugging Face models, which have custom files that are needed for bidirectional attention. l2v.save will not save those files, so if you load from a local directory, you are actually using a unidirectional model instead of a bidirectional one.
If you are saving with l2v.save, then you should load the model with the LLM2Vec.from_pretrained method.
This code snippet works without any errors on my end:

import torch
from llm2vec import LLM2Vec

if __name__ == "__main__":
    l2v = LLM2Vec.from_pretrained(
        "meta-llama/Meta-Llama-3-8B-Instruct",
        device_map="cuda",
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
    )
    l2v.encode(["Buddhism (religion/philosophy)"])
Thank you so much for helping me solve this issue. However, another issue arises when I use this code to get embeddings, as shown in the figure.
Do you know what the problem is? I had already tried this way of getting embeddings, but it didn't work. That is why I used the code you shared on Hugging Face to get the embeddings.
This is a known issue. Please upgrade to the latest version of llm2vec, in which it is resolved:
pip install llm2vec==0.1.8
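To confirm that the upgrade took effect, the installed version can be checked from Python. This is only a minimal sketch: the helper names `parse_version` and `installed_at_least` are hypothetical, and it assumes plain dotted numeric version strings such as "0.1.8".

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(v: str) -> tuple:
    # Assumes a purely numeric dotted version such as "0.1.8";
    # suffixes like ".dev0" would raise ValueError here.
    return tuple(int(part) for part in v.split("."))


def installed_at_least(package: str, minimum: str) -> bool:
    # Returns False when the package is not installed at all.
    try:
        return parse_version(version(package)) >= parse_version(minimum)
    except PackageNotFoundError:
        return False


if __name__ == "__main__":
    print(installed_at_least("llm2vec", "0.1.8"))
```

Comparing tuples of integers rather than raw strings avoids ordering "0.1.10" before "0.1.8".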
It works. Thanks for helping me to solve these problems.
No problem. Feel free to re-open the issue if you have any more questions
Hello, I have to bother you again. I recently instruction-tuned a 'meta-llama/Meta-Llama-3-8B-Instruct' model and pushed it to my Hugging Face account as 'YBXL/Meta-Llama-3-8B-InstUMLS-Concept-train11e-06'. The model's structure is shown here:
However, when I loaded my model into the llm2vec framework the same way, the previous issue arose again.
I also tested the base Llama-3-8B-Instruct model and my previous fine-tuned Llama model, and both still work. Only the latest one, "YBXL/Meta-Llama-3-8B-InstUMLS-Concept-train11e-06", triggers this problem in llm2vec. Could you please help me solve it? Thank you so much.
There seems to be a problem with replies; I haven't seen any of your responses.
@Yan2266336, the issue arises because the input text is not put into the proper template. This step happens here, but because the model name has changed, the if condition is not satisfied. Here is a quick workaround that overrides the function and applies the Llama-3 template.
from llm2vec import LLM2Vec
import torch


class CustomModel(LLM2Vec):
    def prepare_for_tokenization(self, text):
        text = (
            "<|start_header_id|>user<|end_header_id|>\n\n"
            + text.strip()
            + "<|eot_id|>"
        )
        return text


l2v = CustomModel.from_pretrained(
    "YBXL/Meta-Llama-3-8B-InstUMLS-Concept-train11e-06",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True
)
l2v.encode(["Buddhism (religion/philosophy)"])
In the future, the prompt template will be specified outside the package (#56).
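To see exactly what string the workaround hands to the tokenizer, the same wrapping can be tried as a standalone function without loading any model. This is just an illustrative sketch: `apply_llama3_user_template` and the two constants are hypothetical names that are not part of llm2vec, and the template strings are copied from the override in this thread.

```python
# Template pieces copied from the prepare_for_tokenization override in this thread.
LLAMA3_USER_PREFIX = "<|start_header_id|>user<|end_header_id|>\n\n"
LLAMA3_TURN_END = "<|eot_id|>"


def apply_llama3_user_template(text: str) -> str:
    # Strip surrounding whitespace, then wrap the text as a single user turn.
    return LLAMA3_USER_PREFIX + text.strip() + LLAMA3_TURN_END


if __name__ == "__main__":
    print(repr(apply_llama3_user_template("  Buddhism (religion/philosophy) ")))
```

Keeping the wrapping in a plain function makes it easy to check the template on a few sentences before plugging it into a model subclass.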
Thank you so much. So, for now, I just need to define a custom model as you described here to apply the prompt template, at least until the prompt template can be specified outside the package, right?
Yes, exactly. Your understanding is correct.
I just followed the code in your instructions to convert the Llama-3-8B-Instruct model into an embedder. When I use the l2v.encode() function to get the embedding of the sentence "Buddhism (religion/philosophy)", the code raises "RuntimeError: The expanded size of the tensor (10) must match the existing size (12) at non-singleton dimension 0. Target sizes: [10]. Tensor sizes: [12]". Only certain inputs trigger this error. Do you know the cause of this problem?