See the readme; you just use the endpoints for getting embedding vectors. To use Llama 2 as the model, you would add the relevant GGUF model file using the endpoint for that.
This endpoint:
POST /get_embedding_vector_for_string/: Retrieve Embedding Vector for a Given Text String. Retrieves the embedding vector for a given input text string using the specified model.
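For reference, here is a minimal sketch of calling that endpoint from Python with requests; the port, the JSON field names (`text`, `llm_model_name`), and the response key are assumptions, so check the service's interactive Swagger docs for the exact request and response schema.

```python
# Hedged sketch: port, field names, and response key are assumptions;
# verify against the running service's Swagger UI (/docs).
import requests

resp = requests.post(
    "http://localhost:8089/get_embedding_vector_for_string/",
    json={
        "text": "The quick brown fox jumps over the lazy dog.",
        "llm_model_name": "llama-2-7b-chat",  # assumed identifier for the added model
    },
    timeout=120,
)
resp.raise_for_status()
embedding = resp.json()["embedding"]  # assumed key holding the vector
print(f"embedding dimension: {len(embedding)}")
```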
And you could add Llama 2 7B by adding this model file:
https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/raw/main/llama-2-7b-chat.Q5_K_M.gguf
with this endpoint:
POST /add_new_model/: Add New Model by URL. Submit a new model URL for download and use. The model must be in .gguf format and larger than 100 MB to ensure it's a valid model file (you can directly paste in the Huggingface URL)
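A hedged sketch of submitting that model URL through the endpoint; whether the URL is passed as a query parameter named `model_url` or in a JSON body is an assumption here, so again verify against the Swagger docs.

```python
# Hedged sketch: the parameter name "model_url" and passing it as a query
# parameter are assumptions; the URL is the one quoted above.
import requests

model_url = (
    "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/"
    "raw/main/llama-2-7b-chat.Q5_K_M.gguf"
)
resp = requests.post(
    "http://localhost:8089/add_new_model/",
    params={"model_url": model_url},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # expect a status/confirmation message from the service
```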
Hi, thanks for the great work! I checked the code but I couldn't find the description or code for extracting sentence embeddings from the LLaMA (or Llama 2) model. I'm curious about:
Q1. How do you extract the sentence embedding?
Q2. Do you take the average sentence embedding or the last-token embedding?
For extracting the sentence embedding, I'm using `output = model(**inputs, output_hidden_states=True)` and then `sentence_embedding = output.hidden_states[-1].mean(dim=1)` or `output.hidden_states[-1][:, -1, :]`. I don't know the difference between these two. I'd appreciate it if you could share some knowledge on it!
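For context, a minimal sketch (not code from this repository) contrasting the two pooling strategies the question mentions, using the Hugging Face transformers API; the checkpoint name is only an example, and real usage would also want attention-mask-aware mean pooling for padded batches.

```python
# Sketch of the two pooling strategies asked about, using transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # example checkpoint (gated on HF)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    output = model(**inputs, output_hidden_states=True)

last_hidden = output.hidden_states[-1]        # (batch, seq_len, hidden_dim)

# Mean pooling: average the last layer's hidden states over all positions.
mean_embedding = last_hidden.mean(dim=1)      # (batch, hidden_dim)

# Last-token pooling: keep only the final position; in a causal model this
# is the only token whose hidden state has attended to the whole sequence.
last_token_embedding = last_hidden[:, -1, :]  # (batch, hidden_dim)
```

The practical difference is that mean pooling averages information from every token position, while last-token pooling relies on the final position having summarized the sequence through causal attention.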