ShuheWang1998 / GPT-NER


Entity embedding similarity is not present #3

Open shrimonmuke0202 opened 1 year ago

shrimonmuke0202 commented 1 year ago

Hi, I went through your work and found it fascinating. In your paper, you state that you create few-shot examples using entity embeddings, but I could not find the entity-embedding part in your released code.

ShuheWang1998 commented 1 year ago

Sorry for the long delay in replying. As detailed in the paper, the entity-embedding method uses a pre-trained NER model to first recognize and extract the entity/entities from a sentence. The original sentence that an extracted entity belongs to is then used to form the prompt. So, to use entity embeddings, you just need to prepare a pre-trained NER model and extract the few-shot examples following the sentence-level code.
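For example, a minimal sketch of the extraction step might look like the following (the specific model `dslim/bert-base-NER` is only an illustrative choice, not necessarily the model used in the paper):

```python
# Sketch: extract entities with an off-the-shelf pre-trained NER model.
# The model name below is an illustrative assumption.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",
    aggregation_strategy="simple",  # merge sub-tokens into whole entity spans
)

def extract_entities(sentence: str):
    """Return (entity_text, entity_type) pairs found in the sentence."""
    return [(e["word"], e["entity_group"]) for e in ner(sentence)]

print(extract_entities("Obama lives in Washington"))
# e.g. [('Obama', 'PER'), ('Washington', 'LOC')]
```

Each extracted entity keeps a pointer back to its source sentence, so a retrieved entity can be mapped to the full sentence when building the prompt.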

OStars commented 10 months ago

> Sorry for the long delay in replying. As detailed in the paper, the entity-embedding method uses a pre-trained NER model to first recognize and extract the entity/entities from a sentence. The original sentence that an extracted entity belongs to is then used to form the prompt. So, to use entity embeddings, you just need to prepare a pre-trained NER model and extract the few-shot examples following the sentence-level code.

Hi, does "the original sentence that an extracted entity belongs to is then used to form the prompt" mean that the extracted entities and entity types are concatenated in front of the sentence for sentence-level embedding? For example, given the location entity "Washington" in "Obama lives in Washington", we would construct a prompt like "Washington (Location) Obama lives in Washington", compute its sentence embedding, and store it in the datastore. In the kNN search phase, we would then use the entity representation derived from the pre-trained NER model as the query to search for few-shot demonstrations in the datastore. Is that right?
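In code, my interpretation of the stored key would be something like this sketch (the exact format is my guess, not something confirmed by the paper):

```python
# Sketch of my interpretation above; the key format
# "<entity> (<Type>) <sentence>" is a guess, not confirmed by the paper.
def datastore_key(entity: str, entity_type: str, sentence: str) -> str:
    return f"{entity} ({entity_type}) {sentence}"

key = datastore_key("Washington", "Location", "Obama lives in Washington")
# key == "Washington (Location) Obama lives in Washington"
# This string would be embedded and stored in the datastore, while the
# kNN query would use the entity representation from the NER model.
```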

One more question: which pre-trained NER model did you use in your experiments?

Thanks.

swtb3 commented 5 months ago

I have tried reproducing this retrieval method in ChromaDB following the steps in the paper. We find that the entity-level embedding actually has a negative impact on F1 score.

I'm wondering if our implementation is off, as the paper is quite ambiguous about how exactly the entity level is handled.

My understanding is as follows (a code sketch of both phases follows the two lists below):

Creation of vectorstore

  1. Extract entities from the training sentences using a finetuned NER model.
  2. Compute embeddings for these extracted entities as whole spans, i.e. "United Nations" rather than "United" and "Nations" separately.
  3. Add embeddings to vectorstore.

Querying vectorstore

  1. Extract entities from query sentence.
  2. Compute embeddings for N extracted entities.
  3. Retrieve the top K most similar entities for each of the N extracted entities, yielding K*N retrieved entities (using cosine similarity).
  4. Select the top K from that superset (presumably based on the same distance metric).
  5. Use associated sentences as context.
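A minimal sketch of this procedure as we understand it (the NER model, sentence encoder, and helper names here are illustrative assumptions, since the paper does not specify them):

```python
# Sketch of both phases above using ChromaDB. The NER model and the
# sentence encoder are illustrative assumptions; the paper does not
# pin down which models to use.
import chromadb
from sentence_transformers import SentenceTransformer
from transformers import pipeline

ner = pipeline("token-classification", model="dslim/bert-base-NER",
               aggregation_strategy="simple")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def extract_entities(sentence: str):
    """(entity_text, entity_type) pairs from the (assumed) finetuned NER model."""
    return [(e["word"], e["entity_group"]) for e in ner(sentence)]

def embed(text: str) -> list[float]:
    return encoder.encode(text).tolist()

client = chromadb.Client()
store = client.create_collection("entity_store", metadata={"hnsw:space": "cosine"})

# --- Creation of vectorstore ---
train_sentences = [
    "Obama lives in Washington .",
    "The United Nations met in Geneva .",
]
for i, sentence in enumerate(train_sentences):
    for j, (entity, _etype) in enumerate(extract_entities(sentence)):
        store.add(
            ids=[f"{i}-{j}"],
            embeddings=[embed(entity)],          # whole span, e.g. "United Nations"
            metadatas=[{"sentence": sentence}],  # source sentence for the prompt
        )

# --- Querying vectorstore ---
def retrieve_context(query_sentence: str, k: int) -> list[str]:
    candidates = []  # (cosine distance, source sentence) over all query entities
    for entity, _etype in extract_entities(query_sentence):
        res = store.query(query_embeddings=[embed(entity)], n_results=k)
        for dist, meta in zip(res["distances"][0], res["metadatas"][0]):
            candidates.append((dist, meta["sentence"]))
    # K*N candidates -> top K overall by the same distance metric
    candidates.sort(key=lambda pair: pair[0])
    return [s for _, s in candidates[:k]]  # duplicate sentences could be filtered here
```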

@ShuheWang1998 Can you please clarify this and let us know whether this is indeed the proper algorithm to follow, and if so, which finetuned model and/or retriever is used? (Or will any model suffice, so long as it is finetuned for the NER task?)

@OStars @shrimonmuke0202 Can you both confirm whether you share the above interpretation of the algorithm?