openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

How to get sentence embeddings using gpt2 #322

Open Pranav082001 opened 1 year ago

Pranav082001 commented 1 year ago

Hi, I have been experimenting with GPT-2 to get sentence embeddings for a semantic similarity task, but I am not getting the results I want. GPT-2 is a left-to-right autoregressive language model, so I thought it would be good at picking up the context of the whole sentence.

I tried two approaches to get a representation of the sentence:

1) Mean pooling of the embeddings of all input words in the sentence
2) The embedding of the last word
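
A minimal, library-free sketch of the two pooling strategies, for reference. It assumes the per-token vectors have already been extracted from the model (for example, from its final hidden layer) and that a 0/1 mask marks which positions are real tokens rather than padding. The names `mean_pool` and `last_token_pool`, the mask convention, and the toy vectors are all hypothetical and used only for illustration; this is not the exact code from the experiments described here.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(token_vectors: Sequence[Sequence[float]], mask: Sequence[int]) -> Vector:
    """Approach 1: average the vectors of all real (non-padding) tokens."""
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def last_token_pool(token_vectors: Sequence[Sequence[float]], mask: Sequence[int]) -> Vector:
    """Approach 2: take the vector of the last real (non-padding) token."""
    for vec, keep in zip(reversed(token_vectors), reversed(mask)):
        if keep:
            return list(vec)
    raise ValueError("mask selects no tokens")


# Toy per-token vectors (hypothetical values); the third position is padding.
hidden = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]

sentence_mean = mean_pool(hidden, mask)        # [2.0, 1.0]
sentence_last = last_token_pool(hidden, mask)  # [3.0, 2.0]
```

One detail worth checking with either approach is padding: if padded positions are not masked out, the mean is diluted and the "last" position may be a padding token rather than the final word.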

Neither approach works well. By contrast, I tried the same thing with the GPT-3 API, and the quality of those embeddings seems good. I evaluated the embeddings subjectively on the semantic similarity task.
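
A less subjective check could compare sentence vectors with cosine similarity. The sketch below assumes each sentence has already been reduced to a single vector; the function name `cosine_similarity` and the toy embeddings are hypothetical, and cosine similarity is offered here as one common choice rather than as the measure used in the evaluation above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy sentence embeddings (hypothetical values, not real model output).
emb_a = [1.0, 0.0]
emb_b = [0.0, 1.0]
emb_c = [2.0, 0.0]

sim_same_direction = cosine_similarity(emb_a, emb_c)  # same direction, different length
sim_orthogonal = cosine_similarity(emb_a, emb_b)      # orthogonal vectors
```

Because cosine similarity ignores vector length, it only measures direction. If nearly all sentence pairs score close to 1.0, the embeddings may share a large common component that hides the differences between sentences.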

Thanks in advance.