google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Is BERT capable of producing semantically close word embeddings for synonyms? #1369

Open niquet opened 1 year ago

niquet commented 1 year ago

Hello everyone, I am currently working on my undergraduate thesis on matching job descriptions to resumes based on the contents of both. Recently, I came across the following statement by Schmitt et al., 2016: "[...] [Recruiters] and job seekers [...] do not seem to speak the same language [...]. More precisely, CVs and job announcements tend to use different vocabularies, and same words might be used with different meanings".

Therefore, I wonder whether BERT can create contextualized word embeddings that are semantically close for synonyms and semantically distant for identical words that carry different meanings in the context of resumes and job postings.

Thank you very much in advance!

Pixelatory commented 6 months ago

It's a bit late, but take a look at the following: https://krishansubudhi.github.io/deeplearning/2020/08/27/bert-embeddings-visualization.html

It lets you visualize BERT's contextualized embeddings, so you can see for yourself whether this holds for your purposes.
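As a numeric complement to the visualization, here is a minimal sketch that compares the contextual vectors of two words with cosine similarity. It rests on several assumptions that are not from this thread: it uses the Hugging Face `transformers` and `torch` packages rather than this repository's TensorFlow code, the `bert-base-uncased` checkpoint, the last hidden layer, and mean pooling over a word's WordPiece tokens. The helper names `cosine_similarity` and `bert_word_vector` are hypothetical, and the example sentences in the comments are made up for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def bert_word_vector(sentence: str, word: str,
                     model_name: str = "bert-base-uncased") -> List[float]:
    """Contextual vector for the first occurrence of `word` in `sentence`.

    Assumption: the vector is the mean of the last-layer hidden states
    over the word's WordPiece tokens. Other layers or pooling schemes
    may suit a given task better.
    """
    # Imported lazily so cosine_similarity works without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Reloaded on every call for brevity; cache these in real use.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    encoded = tokenizer(sentence, return_tensors="pt")
    sentence_ids = encoded["input_ids"][0].tolist()
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    if not word_ids:
        raise ValueError("word produced no tokens")

    # Locate the word's token span inside the sentence's token ids.
    span = len(word_ids)
    for start in range(len(sentence_ids) - span + 1):
        if sentence_ids[start:start + span] == word_ids:
            break
    else:
        raise ValueError(f"{word!r} not found in tokenized sentence")

    with torch.no_grad():
        hidden = model(**encoded).last_hidden_state[0]
    return hidden[start:start + span].mean(dim=0).tolist()


# Hypothetical usage (downloads the model on first run):
# a = bert_word_vector("We are hiring a software developer.", "developer")
# b = bert_word_vector("Worked five years as a software engineer.", "engineer")
# print(cosine_similarity(a, b))
```

If BERT behaves the way the question hopes, synonym pairs should score higher than pairs where the same word is used in two different senses. It is probably more informative to compare such scores against each other, over many sentence pairs from real resumes and job postings, than to judge any single score against a fixed threshold.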